Theory and Practice of Adaptive Filters
Transcript of Theory and Practice of Adaptive Filters
![Page 1: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/1.jpg)
Theory and Practice of Adaptive Filters
Sam McCauley
September 21, 2021
Williams College
![Page 2: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/2.jpg)
Filters
• Filter: succinct data structure used to speed up database
queries
• Example: Bloom Filter
• Let’s say our database stores a set of keys S .
1
![Page 3: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/3.jpg)
Filters
• Filter: succinct data structure used to speed up database
queries
• Example: Bloom Filter
• Let’s say our database stores a set of keys S .
1
![Page 4: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/4.jpg)
Filters
• Filter: succinct data structure used to speed up database
queries
• Example: Bloom Filter
• Let’s say our database stores a set of keys S .
1
![Page 5: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/5.jpg)
Common filter usage
qLookup(q)
Database queries are very expensive!
2
![Page 6: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/6.jpg)
Database queries are often “unnecessary”
Many workloads involve mostly “negative” queries: queries to keys
not stored in the database. (query q /∈ S)
• Classic example: dictionary of unusually-hyphenated words for
a spellchecker.
• Checking if key already exists before an insert (deduplication
in general)
• Check for malicious URLs
• Table with many empty entries
Classic filter usage: succinct data structure that will allow us to
“filter out” negative queries.
3
![Page 7: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/7.jpg)
Database queries are often “unnecessary”
Many workloads involve mostly “negative” queries: queries to keys
not stored in the database. (query q /∈ S)
• Classic example: dictionary of unusually-hyphenated words for
a spellchecker.
• Checking if key already exists before an insert (deduplication
in general)
• Check for malicious URLs
• Table with many empty entries
Classic filter usage: succinct data structure that will allow us to
“filter out” negative queries.
3
![Page 8: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/8.jpg)
Database queries are often “unnecessary”
Many workloads involve mostly “negative” queries: queries to keys
not stored in the database. (query q /∈ S)
• Classic example: dictionary of unusually-hyphenated words for
a spellchecker.
• Checking if key already exists before an insert (deduplication
in general)
• Check for malicious URLs
• Table with many empty entries
Classic filter usage: succinct data structure that will allow us to
“filter out” negative queries.
3
![Page 9: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/9.jpg)
Database queries are often “unnecessary”
Many workloads involve mostly “negative” queries: queries to keys
not stored in the database. (query q /∈ S)
• Classic example: dictionary of unusually-hyphenated words for
a spellchecker.
• Checking if key already exists before an insert (deduplication
in general)
• Check for malicious URLs
• Table with many empty entries
Classic filter usage: succinct data structure that will allow us to
“filter out” negative queries.
3
![Page 10: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/10.jpg)
Filters
• Store compact
representation of a set S
• Answer membership
queries “is q ∈ S?”
• Lossy compression: can
make mistakes
• No false negatives: q /∈ S
is always correct.
• False positive with
probability ε.
• Space: n log2 1/ε + O(n)
bits.
4
![Page 11: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/11.jpg)
Filters
• Store compact
representation of a set S
• Answer membership
queries “is q ∈ S?”
• Lossy compression: can
make mistakes
• No false negatives: q /∈ S
is always correct.
• False positive with
probability ε.
• Space: n log2 1/ε + O(n)
bits.
4
![Page 12: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/12.jpg)
Filters
• Store compact
representation of a set S
• Answer membership
queries “is q ∈ S?”
• Lossy compression: can
make mistakes
• No false negatives: q /∈ S
is always correct.
• False positive with
probability ε.
• Space: n log2 1/ε + O(n)
bits.
4
![Page 13: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/13.jpg)
Filters
• Store compact
representation of a set S
• Answer membership
queries “is q ∈ S?”
• Lossy compression: can
make mistakes
• No false negatives: q /∈ S
is always correct.
• False positive with
probability ε.
• Space: n log2 1/ε + O(n)
bits.
4
![Page 14: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/14.jpg)
Filters
• Store compact
representation of a set S
• Answer membership
queries “is q ∈ S?”
• Lossy compression: can
make mistakes
• No false negatives: q /∈ S
is always correct.
• False positive with
probability ε.
• Space: n log2 1/ε + O(n)
bits.
4
![Page 15: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/15.jpg)
Filters
• Store compact
representation of a set S
• Answer membership
queries “is q ∈ S?”
• Lossy compression: can
make mistakes
• No false negatives: q /∈ S
is always correct.
• False positive with
probability ε.
• Space: n log2 1/ε + O(n)
bits.
4
![Page 16: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/16.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 17: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/17.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 18: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/18.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 19: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/19.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?
Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 20: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/20.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?
Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.
If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 21: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/21.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?
Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 22: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/22.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?
Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.
If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 23: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/23.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?
No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 24: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/24.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?
No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 25: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/25.jpg)
Common filter usage
0 1 0 0 1 0 0 1
q
Is q ∈ S?Yes, q ∈ S .
Is q ∈ S?
Is q ∈ S?
No, q /∈ S .
Filters are so small that they can fit in local memory.
Filters can be used to “filter out” negative membership queries,
improving database performance.
Fast in-memory query.If filter reports q ∈ S , access the database. We do want this if
q ∈ S!
If q /∈ S (false positive), still do an unnecessary access.
Always correct! Don’t need to access database.
5
![Page 26: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/26.jpg)
Common filter usage
• With n log 1/ε
All logs are
base 2 today
local memory, can filter out 1− ε database
accesses to keys q /∈ S .
• Greatly reduces number of remote accesses, thereby reducing
time.
6
![Page 27: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/27.jpg)
Common filter usage
• With n log 1/ε
All logs are
base 2 today
local memory, can filter out 1− ε database
accesses to keys q /∈ S .
• Greatly reduces number of remote accesses, thereby reducing
time.
6
![Page 28: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/28.jpg)
Common filter usage
• With n log 1/ε
All logs are
base 2 today
local memory, can filter out 1− ε database
accesses to keys q /∈ S .
• Greatly reduces number of remote accesses, thereby reducing
time.
6
![Page 29: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/29.jpg)
How Filters Work
![Page 30: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/30.jpg)
Classic: bloom filter
1 0 1 0 1 0 1 0 1
x
h1(x)
h2(x) h3(x)
y
h1(y)h2(y)
h3(y)
We won’t use this today
7
![Page 31: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/31.jpg)
Single-hash filters [PPR 05]
• Choose a random hash function h : U → 0, . . . n/ε− 1
• Store h(x) for all x ∈ S .
• On query q, see if h(q) is stored; if so report that q ∈ S
• No false negatives!
• False positive probability: probability ε/n that h(q) = h(x) fora single x ∈ S
Assume fully
random h today
• union bound gives us ε of any collision.
8
![Page 32: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/32.jpg)
Single-hash filters [PPR 05]
• Choose a random hash function h : U → 0, . . . n/ε− 1
• Store h(x) for all x ∈ S .
• On query q, see if h(q) is stored; if so report that q ∈ S
• No false negatives!
• False positive probability: probability ε/n that h(q) = h(x) fora single x ∈ S
Assume fully
random h today
• union bound gives us ε of any collision.
8
![Page 33: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/33.jpg)
Single-hash filters [PPR 05]
• Choose a random hash function h : U → 0, . . . n/ε− 1
• Store h(x) for all x ∈ S .
• On query q, see if h(q) is stored; if so report that q ∈ S
• No false negatives!
• False positive probability: probability ε/n that h(q) = h(x) fora single x ∈ S
Assume fully
random h today
• union bound gives us ε of any collision.
8
![Page 34: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/34.jpg)
Single-hash filters [PPR 05]
• Choose a random hash function h : U → 0, . . . n/ε− 1
• Store h(x) for all x ∈ S .
• On query q, see if h(q) is stored; if so report that q ∈ S
• No false negatives!
• False positive probability: probability ε/n that h(q) = h(x) fora single x ∈ S
Assume fully
random h today
• union bound gives us ε of any collision.
8
![Page 35: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/35.jpg)
Single-hash filters [PPR 05]
• Choose a random hash function h : U → 0, . . . n/ε− 1
• Store h(x) for all x ∈ S .
• On query q, see if h(q) is stored; if so report that q ∈ S
• No false negatives!
• False positive probability: probability ε/n that h(q) = h(x) fora single x ∈ S
Assume fully
random h today
• union bound gives us ε of any collision.
8
![Page 36: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/36.jpg)
Single-hash filters [PPR 05]
• Choose a random hash function h : U → 0, . . . n/ε− 1
• Store h(x) for all x ∈ S .
• On query q, see if h(q) is stored; if so report that q ∈ S
• No false negatives!
• False positive probability: probability ε/n that h(q) = h(x) fora single x ∈ S
Assume fully
random h today
• union bound gives us ε of any collision.
8
![Page 37: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/37.jpg)
Single-hash filters [PPR 05]
• Choose a random hash function h : U → 0, . . . n/ε− 1
• Store h(x) for all x ∈ S .
• On query q, see if h(q) is stored; if so report that q ∈ S
• No false negatives!
• False positive probability: probability ε/n that h(q) = h(x) fora single x ∈ S
Assume fully
random h today
• union bound gives us ε of any collision.
8
![Page 38: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/38.jpg)
Space? [PPR 05]
• Store n hashes, each in 0, . . . , n/ε− 1.
• Entropy bound: log2(n/ε
n
)≈ n log 1/ε
• Classic dynamic dictionaries can store data in (close to)
entropy bound
• In a moment we’ll see one practical method to store the
hashes
9
![Page 39: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/39.jpg)
Space? [PPR 05]
• Store n hashes, each in 0, . . . , n/ε− 1.
• Entropy bound: log2(n/ε
n
)≈ n log 1/ε
• Classic dynamic dictionaries can store data in (close to)
entropy bound
• In a moment we’ll see one practical method to store the
hashes
9
![Page 40: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/40.jpg)
Space? [PPR 05]
• Store n hashes, each in 0, . . . , n/ε− 1.
• Entropy bound: log2(n/ε
n
)≈ n log 1/ε
• Classic dynamic dictionaries can store data in (close to)
entropy bound
• In a moment we’ll see one practical method to store the
hashes
9
![Page 41: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/41.jpg)
Space? [PPR 05]
• Store n hashes, each in 0, . . . , n/ε− 1.
• Entropy bound: log2(n/ε
n
)≈ n log 1/ε
• Classic dynamic dictionaries can store data in (close to)
entropy bound
• In a moment we’ll see one practical method to store the
hashes
9
![Page 42: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/42.jpg)
Lower bound [CFGMW 78]
• Store at most εU items out of U; These items must contain
the n items in S
• Therefore, a given memory representation can only be used for(εUn
)sets S ′
• Need enough memory representations to store all(Un
)possible
S ′; therefore need(Un
)/(εUn
)memory representations
space ≥ log2
(U
n
)/(εU
n
)≈ n log2 1/ε
10
![Page 43: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/43.jpg)
Lower bound [CFGMW 78]
• Store at most εU items out of U; These items must contain
the n items in S
• Therefore, a given memory representation can only be used for(εUn
)sets S ′
• Need enough memory representations to store all(Un
)possible
S ′; therefore need(Un
)/(εUn
)memory representations
space ≥ log2
(U
n
)/(εU
n
)≈ n log2 1/ε
10
![Page 44: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/44.jpg)
Lower bound [CFGMW 78]
• Store at most εU items out of U; These items must contain
the n items in S
• Therefore, a given memory representation can only be used for(εUn
)sets S ′
• Need enough memory representations to store all(Un
)possible
S ′; therefore need(Un
)/(εUn
)memory representations
space ≥ log2
(U
n
)/(εU
n
)≈ n log2 1/ε
10
![Page 45: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/45.jpg)
Adaptivity
![Page 46: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/46.jpg)
David Nelson (motivation from bloomier filters [CKR+ 04])
11
![Page 47: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/47.jpg)
Why is fixing false positives important?
• Improve performance by fixing common false positives
• Security implications
• Consistent performance
• xε database accesses in expectation may mean probability ε of
x accesses
12
![Page 48: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/48.jpg)
Why is fixing false positives important?
• Improve performance by fixing common false positives
• Security implications
• Consistent performance
• xε database accesses in expectation may mean probability ε of
x accesses
12
![Page 49: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/49.jpg)
Why is fixing false positives important?
• Improve performance by fixing common false positives
• Security implications
• Consistent performance
• xε database accesses in expectation may mean probability ε of
x accesses
12
![Page 50: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/50.jpg)
Why is fixing false positives important?
• Improve performance by fixing common false positives
• Security implications
• Consistent performance
• xε database accesses in expectation may mean probability ε of
x accesses
12
![Page 51: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/51.jpg)
Adaptivity
• Goal: when a false positive comes, fix it
• Move things around so that it’s no longer a false positive
• In terms of lower bound: on false positive q, want to change
to another memory representation that is correct for all x ∈ S
and also answers q /∈ S .
13
![Page 52: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/52.jpg)
Adaptivity
• Goal: when a false positive comes, fix it
• Move things around so that it’s no longer a false positive
• In terms of lower bound: on false positive q, want to change
to another memory representation that is correct for all x ∈ S
and also answers q /∈ S .
13
![Page 53: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/53.jpg)
Adaptivity
• Goal: when a false positive comes, fix it
• Move things around so that it’s no longer a false positive
• In terms of lower bound: on false positive q, want to change
to another memory representation that is correct for all x ∈ S
and also answers q /∈ S .
13
![Page 54: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/54.jpg)
Adaptivity: More formal definition [BFGJMS 18]
• An adversary generates a sequence of n queries
• When generating query i , adversary knows the result of
queries 1, . . . , i − 1 (knows of each was a false positive)
• Sustained false positive rate: false positive rate of a filter
against such an adversary. Filter is adaptive if it can be tuned
to achieve any sustained false positive rate ε
• Goal: Sustained f.p.r. ε using n log(1/ε) + O(n) bits
14
![Page 55: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/55.jpg)
Adaptivity: More formal definition [BFGJMS 18]
• An adversary generates a sequence of n queries
• When generating query i , adversary knows the result of
queries 1, . . . , i − 1 (knows of each was a false positive)
• Sustained false positive rate: false positive rate of a filter
against such an adversary. Filter is adaptive if it can be tuned
to achieve any sustained false positive rate ε
• Goal: Sustained f.p.r. ε using n log(1/ε) + O(n) bits
14
![Page 56: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/56.jpg)
Adaptivity: More formal definition [BFGJMS 18]
• An adversary generates a sequence of n queries
• When generating query i , adversary knows the result of
queries 1, . . . , i − 1 (knows of each was a false positive)
• Sustained false positive rate: false positive rate of a filter
against such an adversary. Filter is adaptive if it can be tuned
to achieve any sustained false positive rate ε
• Goal: Sustained f.p.r. ε using n log(1/ε) + O(n) bits
14
![Page 57: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/57.jpg)
Adaptivity: More formal definition [BFGJMS 18]
• An adversary generates a sequence of n queries
• When generating query i , adversary knows the result of
queries 1, . . . , i − 1 (knows of each was a false positive)
• Sustained false positive rate: false positive rate of a filter
against such an adversary. Filter is adaptive if it can be tuned
to achieve any sustained false positive rate ε
• Goal: Sustained f.p.r. ε using n log(1/ε) + O(n) bits
14
![Page 58: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/58.jpg)
Assumption (for the moment)
• Let’s assume that we can access S to fix a false positive.
• Accessing S is free anyway: we accessed it due to the false
positive.
• Without this assumption, we risk removing the original x ∈ S
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Advice
15
![Page 59: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/59.jpg)
Assumption (for the moment)
• Let’s assume that we can access S to fix a false positive.
• Accessing S is free anyway: we accessed it due to the false
positive.
• Without this assumption, we risk removing the original x ∈ S
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Advice
15
![Page 60: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/60.jpg)
Assumption (for the moment)
• Let’s assume that we can access S to fix a false positive.
• Accessing S is free anyway: we accessed it due to the false
positive.
• Without this assumption, we risk removing the original x ∈ S
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Advice
15
![Page 61: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/61.jpg)
How can we adapt?
• Change the hash?
• Difficult to get to work. But we’ll come back to this idea.
• Idea: add bits to our hash
• h : U → [n/ε] can be represented using log2 n + log2 1/ε bits.
What if we just add more bits on the end?
16
![Page 62: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/62.jpg)
How can we adapt?
• Change the hash?
• Difficult to get to work. But we’ll come back to this idea.
• Idea: add bits to our hash
• h : U → [n/ε] can be represented using log2 n + log2 1/ε bits.
What if we just add more bits on the end?
16
![Page 63: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/63.jpg)
How can we adapt?
• Change the hash?
• Difficult to get to work. But we’ll come back to this idea.
• Idea: add bits to our hash
• h : U → [n/ε] can be represented using log2 n + log2 1/ε bits.
What if we just add more bits on the end?
16
![Page 64: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/64.jpg)
Adding bits [BFGJMS ’18]
• On a false positive caused by h(q) = h(x), add bits onto thestored value of h(x).
• Idea: let’s say h(x) outputs Θ(log n) bits; we only store a
log(n/ε)-bit prefix
• After O(1) bits in expectation, the stored prefix of h(x) no
longer matches h(q)
• We’ve fixed the false positive!
17
![Page 65: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/65.jpg)
Adding bits [BFGJMS ’18]
• On a false positive caused by h(q) = h(x), add bits onto thestored value of h(x).
• Idea: let’s say h(x) outputs Θ(log n) bits; we only store a
log(n/ε)-bit prefix
• After O(1) bits in expectation, the stored prefix of h(x) no
longer matches h(q)
• We’ve fixed the false positive!
17
![Page 66: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/66.jpg)
Adding bits [BFGJMS ’18]
• On a false positive caused by h(q) = h(x), add bits onto thestored value of h(x).
• Idea: let’s say h(x) outputs Θ(log n) bits; we only store a
log(n/ε)-bit prefix
• After O(1) bits in expectation, the stored prefix of h(x) no
longer matches h(q)
• We’ve fixed the false positive!
17
![Page 67: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/67.jpg)
Space and rebuilds [BFGJMS ’18]
• Fixing a false positive requires O(1) extra bits in expectation
• To keep to n log 1/ε + O(n) space, need to rebuild every Θ(n)
false positives
• Can deamortize over those queries.
• In paper: deamortize in-place
• Result: adaptive filter with no additional cost!
18
![Page 68: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/68.jpg)
Space and rebuilds [BFGJMS ’18]
• Fixing a false positive requires O(1) extra bits in expectation
• To keep to n log 1/ε + O(n) space, need to rebuild every Θ(n)
false positives
• Can deamortize over those queries.
• In paper: deamortize in-place
• Result: adaptive filter with no additional cost!
18
![Page 69: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/69.jpg)
Space and rebuilds [BFGJMS ’18]
• Fixing a false positive requires O(1) extra bits in expectation
• To keep to n log 1/ε + O(n) space, need to rebuild every Θ(n)
false positives
• Can deamortize over those queries.
• In paper: deamortize in-place
• Result: adaptive filter with no additional cost!
18
![Page 70: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/70.jpg)
Space and rebuilds [BFGJMS ’18]
• Fixing a false positive requires O(1) extra bits in expectation
• To keep to n log 1/ε + O(n) space, need to rebuild every Θ(n)
false positives
• Can deamortize over those queries.
• In paper: deamortize in-place
• Result: adaptive filter with no additional cost!
18
![Page 71: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/71.jpg)
Revisiting Advice Assumption
• If we get NO information, can’t fix the query (could have
q ∈ S)
• Is q /∈ S alone enough information to fix the query?
• No: need Ω(n log logU) bits to achieve adaptivity in that
case. [BFG+ 18]
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Adviceq /∈ S
19
Advice from the database is not
just cheap; it’s necessary to
achieve adaptivity!
![Page 72: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/72.jpg)
Revisiting Advice Assumption
• If we get NO information, can’t fix the query (could have
q ∈ S)
• Is q /∈ S alone enough information to fix the query?
• No: need Ω(n log logU) bits to achieve adaptivity in that
case. [BFG+ 18]
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Adviceq /∈ S
19
Advice from the database is not
just cheap; it’s necessary to
achieve adaptivity!
![Page 73: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/73.jpg)
Revisiting Advice Assumption
• If we get NO information, can’t fix the query (could have
q ∈ S)
• Is q /∈ S alone enough information to fix the query?
• No: need Ω(n log logU) bits to achieve adaptivity in that
case. [BFG+ 18]
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Advice
q /∈ S
19
Advice from the database is not
just cheap; it’s necessary to
achieve adaptivity!
![Page 74: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/74.jpg)
Revisiting Advice Assumption
• If we get NO information, can’t fix the query (could have
q ∈ S)
• Is q /∈ S alone enough information to fix the query?
• No: need Ω(n log logU) bits to achieve adaptivity in that
case. [BFG+ 18]
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Advice
q /∈ S
19
Advice from the database is not
just cheap; it’s necessary to
achieve adaptivity!
![Page 75: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/75.jpg)
Revisiting Advice Assumption
• If we get NO information, can’t fix the query (could have
q ∈ S)
• Is q /∈ S alone enough information to fix the query?
• No: need Ω(n log logU) bits to achieve adaptivity in that
case. [BFG+ 18]
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Advice
q /∈ S
19
Advice from the database is not
just cheap; it’s necessary to
achieve adaptivity!
![Page 76: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/76.jpg)
Revisiting Advice Assumption
• If we get NO information, can’t fix the query (could have
q ∈ S)
• Is q /∈ S alone enough information to fix the query?
• No: need Ω(n log logU) bits to achieve adaptivity in that
case. [BFG+ 18]
0 1 0 0 1 0 0 1
q
Yes, q ∈ S .
Is q ∈ S?
q /∈ S
Advice
q /∈ S
19
Advice from the database is not
just cheap; it’s necessary to
achieve adaptivity!
![Page 77: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/77.jpg)
Adaptive Cuckoo Filters
![Page 78: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/78.jpg)
Cuckoo filters [FAKM ’14]
• Practical way to create a filter
• Idea: use two log2 2n-bit location hashes h1(x) and h2(x), and
a 1 + log2(1/ε)-bit fingerprint hash f (x).
• Index into a table with 2n entries; each can store
1 + log2(1/ε) bits
• Invariant: for all x ∈ S , f (x) is stored in h1(x) or h2(x).
20
![Page 79: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/79.jpg)
Cuckoo filters [FAKM ’14]
• Practical way to create a filter
• Idea: use two log2 2n-bit location hashes h1(x) and h2(x), and
a 1 + log2(1/ε)-bit fingerprint hash f (x).
• Index into a table with 2n entries; each can store
1 + log2(1/ε) bits
• Invariant: for all x ∈ S , f (x) is stored in h1(x) or h2(x).
20
![Page 80: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/80.jpg)
Cuckoo filters [FAKM ’14]
• Practical way to create a filter
• Idea: use two log2 2n-bit location hashes h1(x) and h2(x), and
a 1 + log2(1/ε)-bit fingerprint hash f (x).
• Index into a table with 2n entries; each can store
1 + log2(1/ε) bits
• Invariant: for all x ∈ S , f (x) is stored in h1(x) or h2(x).
20
![Page 81: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/81.jpg)
Cuckoo filters [FAKM ’14]
• Practical way to create a filter
• Idea: use two log2 2n-bit location hashes h1(x) and h2(x), and
a 1 + log2(1/ε)-bit fingerprint hash f (x).
• Index into a table with 2n entries; each can store
1 + log2(1/ε) bits
• Invariant: for all x ∈ S , f (x) is stored in h1(x) or h2(x).
20
![Page 82: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/82.jpg)
Cuckooing for the cuckoo filter [FAKM ’14]
• To store an item x in the filter, need to store f (x) in h1(x) or
h2(x).
• If one is empty, great; otherwise cuckoo: pick a fingerprint in
h1(x) or h2(x) and move it to its other position.
• For any set S and hashes h1 and h2, there is a legal
configuration with probability 1− O(1/n) [PR ’01]
• In practice: need to do this dynamically without access to S ;
need to bound update time
21
![Page 83: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/83.jpg)
Cuckooing for the cuckoo filter [FAKM ’14]
• To store an item x in the filter, need to store f (x) in h1(x) or
h2(x).
• If one is empty, great; otherwise cuckoo: pick a fingerprint in
h1(x) or h2(x) and move it to its other position.
• For any set S and hashes h1 and h2, there is a legal
configuration with probability 1− O(1/n) [PR ’01]
• In practice: need to do this dynamically without access to S ;
need to bound update time
21
![Page 84: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/84.jpg)
Cuckooing for the cuckoo filter [FAKM ’14]
• To store an item x in the filter, need to store f (x) in h1(x) or
h2(x).
• If one is empty, great; otherwise cuckoo: pick a fingerprint in
h1(x) or h2(x) and move it to its other position.
• For any set S and hashes h1 and h2, there is a legal
configuration with probability 1− O(1/n) [PR ’01]
• In practice: need to do this dynamically without access to S ;
need to bound update time
21
![Page 85: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/85.jpg)
Cuckooing for the cuckoo filter [FAKM ’14]
• To store an item x in the filter, need to store f (x) in h1(x) or
h2(x).
• If one is empty, great; otherwise cuckoo: pick a fingerprint in
h1(x) or h2(x) and move it to its other position.
• For any set S and hashes h1 and h2, there is a legal
configuration with probability 1− O(1/n) [PR ’01]
• In practice: need to do this dynamically without access to S ;
need to bound update time
21
![Page 86: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/86.jpg)
Cuckoo filter [FAKM ’14]
101 010 110 001
0 1 2 3 4 5
xf (x) = 010
h1(x) h2(x)
22
![Page 87: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/87.jpg)
Cuckoo filter [FAKM ’14]
101 000 111 110 001
0 1 2 3 4 5
xf (x) = 010
h1(x) h2(x)
23
![Page 88: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/88.jpg)
Cuckoo filter [FAKM ’14]
101 000 111 110 001
0 1 2 3 4 5
xf (x) = 010
h1(x) h2(x)
23
![Page 89: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/89.jpg)
Cuckoo filters
• Query q: look for f (q) in h1(q) and h2(q).
• No false negatives!
• False positive: the probability that f (q) = f (y) for the y
stored in h1(q) is 21+log 1/ε = ε/2
• Union bound gives probability ≤ ε
• Space: 2n log2 1/ε + 2n bits
• Can improve to 1.05 log2 1/ε + 2.10n bits using classic cuckoo
hashing tricks
24
![Page 90: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/90.jpg)
Cuckoo filters
• Query q: look for f (q) in h1(q) and h2(q).
• No false negatives!
• False positive: the probability that f (q) = f (y) for the y
stored in h1(q) is 21+log 1/ε = ε/2
• Union bound gives probability ≤ ε
• Space: 2n log2 1/ε + 2n bits
• Can improve to 1.05 log2 1/ε + 2.10n bits using classic cuckoo
hashing tricks
24
![Page 91: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/91.jpg)
Cuckoo filters
• Query q: look for f (q) in h1(q) and h2(q).
• No false negatives!
• False positive: the probability that f (q) = f (y) for the y
stored in h1(q) is 21+log 1/ε = ε/2
• Union bound gives probability ≤ ε
• Space: 2n log2 1/ε + 2n bits
• Can improve to 1.05 log2 1/ε + 2.10n bits using classic cuckoo
hashing tricks
24
![Page 92: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/92.jpg)
Cuckoo filters
• Query q: look for f (q) in h1(q) and h2(q).
• No false negatives!
• False positive: the probability that f (q) = f (y) for the y
stored in h1(q) is 21+log 1/ε = ε/2
• Union bound gives probability ≤ ε
• Space: 2n log2 1/ε + 2n bits
• Can improve to 1.05 log2 1/ε + 2.10n bits using classic cuckoo
hashing tricks
24
![Page 93: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/93.jpg)
Cuckoo filters
• Query q: look for f (q) in h1(q) and h2(q).
• No false negatives!
• False positive: the probability that f (q) = f (y) for the y
stored in h1(q) is 21+log 1/ε = ε/2
• Union bound gives probability ≤ ε
• Space: 2n log2 1/ε + 2n bits
• Can improve to 1.05 log2 1/ε + 2.10n bits using classic cuckoo
hashing tricks
24
![Page 94: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/94.jpg)
How to Adapt with a Cuckoo Filter
![Page 95: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/95.jpg)
Adaptive cuckoo filter [MPR ’18, ’20]
• One way to adapt with a filter
• Idea: use s hash selector bits per slot
• Rather than one fingerprint function f , have 2s fingerprint
functions f1, . . . f2s .
• On a false positive, change the fingerprint function of the
stored element to the next fingerprint; update the hash
selector bits accordingly
25
![Page 96: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/96.jpg)
Adaptive cuckoo filter [MPR ’18, ’20]
• One way to adapt with a filter
• Idea: use s hash selector bits per slot
• Rather than one fingerprint function f , have 2s fingerprint
functions f1, . . . f2s .
• On a false positive, change the fingerprint function of the
stored element to the next fingerprint; update the hash
selector bits accordingly
25
![Page 97: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/97.jpg)
Adaptive cuckoo filter [MPR ’18, ’20]
• One way to adapt with a filter
• Idea: use s hash selector bits per slot
• Rather than one fingerprint function f , have 2s fingerprint
functions f1, . . . f2s .
• On a false positive, change the fingerprint function of the
stored element to the next fingerprint; update the hash
selector bits accordingly
25
![Page 98: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/98.jpg)
Adaptive cuckoo filter [MPR ’18, ’20]
• One way to adapt with a filter
• Idea: use s hash selector bits per slot
• Rather than one fingerprint function f , have 2s fingerprint
functions f1, . . . f2s .
• On a false positive, change the fingerprint function of the
stored element to the next fingerprint; update the hash
selector bits accordingly
25
![Page 99: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/99.jpg)
Adaptive cuckoo filter [MPR ’18, ’20]
101 010 110 001
0 0 1 0 0 1
0 1 2 3 4 5
q
f1(q) = 010
h1(q)
• Cycles between 2s
fingerprint functions
• s bits/slot to keep track of
which is used
• On false positive: go to
next fingerprint function,
update fingerprint of stored
element.
• Requires access to S
26
![Page 100: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/100.jpg)
Adaptive cuckoo filter [MPR ’18, ’20]
101 010 110 001
0 0 1 0 0 1
0 1 2 3 4 5
qf1(q) = 010
h1(q)
• Cycles between 2s
fingerprint functions
• s bits/slot to keep track of
which is used
• On false positive: go to
next fingerprint function,
update fingerprint of stored
element.
• Requires access to S
26
![Page 101: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/101.jpg)
Adaptive cuckoo filter [MPR ’18, ’20]
101 111 110 001
0 0 0 0 0 1
0 1 2 3 4 5
qf1(q) = 010
h1(q)
h1(x) = h1(q)
f1(x) = 010, f2(x) = 111
• Cycles between 2s
fingerprint functions
• s bits/slot to keep track of
which is used
• On false positive: go to
next fingerprint function,
update fingerprint of stored
element.
• Requires access to S
27
![Page 102: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/102.jpg)
Adaptive cuckoo filter performance
101 010 110 001
0 0 1 0 0 1
0 1 2 3 4 5
qf1(q) = 010
h1(q)
• Simple, effective in practice
28
![Page 103: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/103.jpg)
Adaptive cuckoo filter performance
Plot from [MPR ’18]; performance
on network trace data. x-axis is #
queries/n; y axis is false positives.
(Lower is better.)
• Simple, effective in
practice
29
![Page 104: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/104.jpg)
Adaptive cuckoo filter performance
101 010 110 001
0 0 1 0 0 1
0 1 2 3 4 5
qf1(q) = 010
h1(q)
• Simple, effective in practice
• But not adaptive!
• Idea: among 1/ε2s
queries,
one will collide with some y
on all fingerprint functions
30
![Page 105: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/105.jpg)
Adaptive cuckoo filter performance
101 010 110 001
0 0 1 0 0 1
0 1 2 3 4 5
qf1(q) = 010
h1(q)
• Simple, effective in practice
• But not adaptive!
• Idea: among 1/ε2s
queries,
one will collide with some y
on all fingerprint functions
30
![Page 106: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/106.jpg)
Another way to adapt [KMP ’21]
• When you have a collision, move the item to its other slot
101 010 110 001 000
0 1 2 3 4 5 6 7
qf (q) = 110
h1(q) h2(q)
31
![Page 107: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/107.jpg)
Another way to adapt [KMP ’21]
• When you have a collision, move the item to its other slot
101 010 110 001 000
0 1 2 3 4 5 6 7
qf (q) = 110
h1(q) h2(q)
31
![Page 108: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/108.jpg)
Another way to adapt [KMP ’21]
• When you have a collision, move the item to its other slot
101 010 110 000 001
0 1 2 3 4 5 6 7
qf (q) = 110
h1(q) h2(q)
31
![Page 109: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/109.jpg)
Another way to adapt [KMP ’21]
• When you have a collision, move the item to its other slot
• Recall access to S
• Get log n “bits of adaptivity” per attempt rather than log 1/ε
• “Support optimal”: ε(1 + o(1)) false positives per unique
query
• Is this adaptive?
• No—birthday attack!
• After√n/ε2 queries can find q1 and q2 that collide with h1(y)
and h2(y) for some y ∈ S .
32
![Page 110: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/110.jpg)
Another way to adapt [KMP ’21]
• When you have a collision, move the item to its other slot
• Recall access to S
• Get log n “bits of adaptivity” per attempt rather than log 1/ε
• “Support optimal”: ε(1 + o(1)) false positives per unique
query
• Is this adaptive?
• No—birthday attack!
• After√n/ε2 queries can find q1 and q2 that collide with h1(y)
and h2(y) for some y ∈ S .
32
![Page 111: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/111.jpg)
Another way to adapt [KMP ’21]
• When you have a collision, move the item to its other slot
• Recall access to S
• Get log n “bits of adaptivity” per attempt rather than log 1/ε
• “Support optimal”: ε(1 + o(1)) false positives per unique
query
• Is this adaptive?
• No—birthday attack!
• After√n/ε2 queries can find q1 and q2 that collide with h1(y)
and h2(y) for some y ∈ S .
32
![Page 112: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/112.jpg)
Another way to adapt [KMP ’21]
• When you have a collision, move the item to its other slot
• Recall access to S
• Get log n “bits of adaptivity” per attempt rather than log 1/ε
• “Support optimal”: ε(1 + o(1)) false positives per unique
query
• Is this adaptive?
• No—birthday attack!
• After√n/ε2 queries can find q1 and q2 that collide with h1(y)
and h2(y) for some y ∈ S .
32
![Page 113: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/113.jpg)
Cuckooing for adaptivity
• Benefit: no need to store any extra bits; can augment a
normal cuckoo filter with a simple “adapt” function
• Downside: Cache misses on adapts
• Inserts are fairly expensive in cuckoo hashing (lookups are
cheap)
33
![Page 114: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/114.jpg)
Less metadata means better performance!
0.001
0.01
0.1
0 10 20 30 40 50 60 70 80 90 100
Fals
e P
osi
tive r
ate
A/S ratio
Chicago A, f=8
Cuckoo FilterCuckooing ACFSwapping ACF
Cyclic ACF s=1Cyclic ACF s=2Cyclic ACF s=3
34
![Page 115: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/115.jpg)
What do we need for adaptivity?
Lemma 1 (KMP ’21)
If a filter can only use O(1) configurations for a given element it is
not adaptive.
• Proof is basically a generalization of the birthday attack
• Adaptive filter of [BFG+ 18] does not satisfy this (can use
more bits for certain elements; up to O(log n))
• Takeway: necessary to allocate resources unevenly to achieve
adaptivity
35
![Page 116: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/116.jpg)
What do we need for adaptivity?
Lemma 1 (KMP ’21)
If a filter can only use O(1) configurations for a given element it is
not adaptive.
• Proof is basically a generalization of the birthday attack
• Adaptive filter of [BFG+ 18] does not satisfy this (can use
more bits for certain elements; up to O(log n))
• Takeway: necessary to allocate resources unevenly to achieve
adaptivity
35
![Page 117: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/117.jpg)
What do we need for adaptivity?
Lemma 1 (KMP ’21)
If a filter can only use O(1) configurations for a given element it is
not adaptive.
• Proof is basically a generalization of the birthday attack
• Adaptive filter of [BFG+ 18] does not satisfy this (can use
more bits for certain elements; up to O(log n))
• Takeway: necessary to allocate resources unevenly to achieve
adaptivity
35
![Page 118: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/118.jpg)
Telescoping adaptive filter [LMSS ’21]
• Adaptive Cuckoo Filter with variable number of bits per
element (implies actual adaptivity!)
• Uses arithmetic coding—can use a fractional number of bits
per element
• Competitive with other filters in practice; reasonable
throughput
• Cuckooing adaptive filter is still best when adapting is easy
36
![Page 119: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/119.jpg)
Telescoping adaptive filter [LMSS ’21]
• Adaptive Cuckoo Filter with variable number of bits per
element (implies actual adaptivity!)
• Uses arithmetic coding—can use a fractional number of bits
per element
• Competitive with other filters in practice; reasonable
throughput
• Cuckooing adaptive filter is still best when adapting is easy
36
![Page 120: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/120.jpg)
Telescoping adaptive filter [LMSS ’21]
• Adaptive Cuckoo Filter with variable number of bits per
element (implies actual adaptivity!)
• Uses arithmetic coding—can use a fractional number of bits
per element
• Competitive with other filters in practice; reasonable
throughput
• Cuckooing adaptive filter is still best when adapting is easy
36
![Page 121: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/121.jpg)
Telescoping adaptive filter [LMSS ’21]
• Adaptive Cuckoo Filter with variable number of bits per
element (implies actual adaptivity!)
• Uses arithmetic coding—can use a fractional number of bits
per element
• Competitive with other filters in practice; reasonable
throughput
• Cuckooing adaptive filter is still best when adapting is easy
36
![Page 122: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/122.jpg)
Conclusion
• Adaptivity improves filter performance
• Simple methods can be effective in practice without achieving
worst-case adaptivity
37
![Page 123: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/123.jpg)
Conclusion
• Adaptivity improves filter performance
• Simple methods can be effective in practice without achieving
worst-case adaptivity
37
![Page 124: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/124.jpg)
Open problems
• What can we do without advice?
• Need to limit adversary
• Might be possible to “hide” false positives rather than fixing
them
• Simple, practical way to fix false positives short term without
overhead
38
![Page 125: Theory and Practice of Adaptive Filters](https://reader034.fdocuments.us/reader034/viewer/2022042408/625e7db02ad194555759e692/html5/thumbnails/125.jpg)
Thanks!