Theory and Practice of Adaptive Filters

Sam McCauley

September 21, 2021

Williams College

Filters

• Filter: succinct data structure used to speed up database queries

• Example: Bloom Filter

• Let’s say our database stores a set of keys S .


Common filter usage

[Figure: a query q is sent to the database as Lookup(q).]

Database queries are very expensive!


Database queries are often “unnecessary”

Many workloads involve mostly “negative” queries: queries to keys not stored in the database (query q ∉ S).

• Classic example: dictionary of unusually-hyphenated words for a spellchecker.

• Checking if key already exists before an insert (deduplication in general)

• Check for malicious URLs

• Table with many empty entries

Classic filter usage: succinct data structure that will allow us to “filter out” negative queries.


Filters

• Store compact representation of a set S

• Answer membership queries “is q ∈ S?”

• Lossy compression: can make mistakes

• No false negatives: an answer of q ∉ S is always correct.

• False positive with probability ε.

• Space: n log₂(1/ε) + O(n) bits.


Common filter usage

[Figure: a filter (bits 0 1 0 0 1 0 0 1) sits in local memory in front of the database. A query q asks “Is q ∈ S?” and the filter answers either “Yes, q ∈ S” or “No, q ∉ S”.]

Filters are so small that they can fit in local memory.

Filters can be used to “filter out” negative membership queries, improving database performance.

• Fast in-memory query.

• If filter reports q ∈ S, access the database. We do want this if q ∈ S! If q ∉ S (false positive), still do an unnecessary access.

• If filter reports q ∉ S: always correct! Don’t need to access database.
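The usage pattern above can be sketched in a few lines of Python. This is only an illustration: the class name `FilteredStore`, the use of SHA-256 as a stand-in for a random hash, and the in-memory `set` playing the role of the remote database are all assumptions, not part of the talk.

```python
import hashlib

def h(key: str, buckets: int) -> int:
    """Hash a key into [0, buckets); SHA-256 stands in for a random hash."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % buckets

class FilteredStore:
    """A slow 'database' guarded by a small in-memory filter (hypothetical names)."""

    def __init__(self, keys, eps=0.01):
        self.db = set(keys)                      # stands in for the remote database
        self.buckets = int(len(self.db) / eps)   # hash range n / eps
        self.filter = {h(k, self.buckets) for k in self.db}
        self.db_accesses = 0

    def lookup(self, q: str) -> bool:
        if h(q, self.buckets) not in self.filter:
            return False                         # filter says q is not in S: always correct
        self.db_accesses += 1                    # filter says "maybe": pay for a real access
        return q in self.db

store = FilteredStore([f"key{i}" for i in range(1000)])
hits = sum(store.lookup(f"key{i}") for i in range(1000))       # positive queries
misses = sum(store.lookup(f"other{i}") for i in range(10000))  # negative queries
```

Every positive query reaches the database, while only the false positives among the negative queries do; `store.db_accesses` counts both.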


Common filter usage

• With n log(1/ε) + O(n) bits of local memory, can filter out a 1 − ε fraction of database accesses to keys q ∉ S. (All logs are base 2 today.)

• Greatly reduces number of remote accesses, thereby reducing time.


How Filters Work

Classic: Bloom filter

[Figure: bit array 1 0 1 0 1 0 1 0 1; each of the keys x and y is mapped by three hash functions h1, h2, h3 to positions in the array.]

We won’t use this today

Single-hash filters [PPR 05]

• Choose a random hash function h : U → {0, . . . , n/ε − 1} (assume fully random h today)

• Store h(x) for all x ∈ S.

• On query q, see if h(q) is stored; if so report that q ∈ S

• No false negatives!

• False positive probability: probability ε/n that h(q) = h(x) for a single x ∈ S

• Union bound gives us probability at most ε of any collision.
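A minimal sketch of the single-hash filter, assuming a salted SHA-256 as a stand-in for the fully random hash h. The class name `SingleHashFilter` and the demo keys are hypothetical; a real filter would store the hashes in a compact dictionary rather than a Python `set`.

```python
import hashlib

class SingleHashFilter:
    """Stores h(x) for every x in S, with h mapping into {0, ..., n/eps - 1}."""

    def __init__(self, keys, eps, seed=0):
        keys = list(keys)
        self.range = round(len(keys) / eps)
        self.seed = seed
        self.stored = {self._h(x) for x in keys}

    def _h(self, x: str) -> int:
        # Salted SHA-256 is an assumption standing in for a fully random hash.
        digest = hashlib.sha256(f"{self.seed}:{x}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.range

    def query(self, q: str) -> bool:
        return self._h(q) in self.stored

members = [f"member{i}" for i in range(1000)]
filt = SingleHashFilter(members, eps=0.05)
no_false_negatives = all(filt.query(x) for x in members)
trials = 20000
false_positives = sum(filt.query(f"nonmember{i}") for i in range(trials))
observed_rate = false_positives / trials   # should land near or below eps
```

The measured rate is at most about ε because at most n of the n/ε hash values are stored.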


Space? [PPR 05]

• Store n hashes, each in {0, . . . , n/ε − 1}.

• Entropy bound: $\log_2 \binom{n/\varepsilon}{n} \approx n \log_2(1/\varepsilon)$

• Classic dynamic dictionaries can store data in (close to) entropy bound

• In a moment we’ll see one practical method to store the hashes
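The approximation in the entropy bound can be made explicit with the standard binomial estimates $(a/b)^b \le \binom{a}{b} \le (ea/b)^b$. A short derivation, with $a = n/\varepsilon$ and $b = n$:

```latex
\left(\frac{1}{\varepsilon}\right)^{n}
  \;\le\; \binom{n/\varepsilon}{n}
  \;\le\; \left(\frac{e}{\varepsilon}\right)^{n}
\quad\Longrightarrow\quad
n \log_2 \frac{1}{\varepsilon}
  \;\le\; \log_2 \binom{n/\varepsilon}{n}
  \;\le\; n \log_2 \frac{1}{\varepsilon} + n \log_2 e .
```

So the entropy bound is $n \log_2(1/\varepsilon) + O(n)$ bits, which matches the space quoted for filters earlier in the talk.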


Lower bound [CFGMW 78]

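• A given memory representation stores (answers “yes” for) at most εU items out of U; these items must contain the n items in S

• Therefore, a given memory representation can only be used for $\binom{\varepsilon U}{n}$ sets S′

• Need enough memory representations to store all $\binom{U}{n}$ possible S′; therefore need $\binom{U}{n} / \binom{\varepsilon U}{n}$ memory representations

$\text{space} \ge \log_2 \left[ \binom{U}{n} \Big/ \binom{\varepsilon U}{n} \right] \approx n \log_2(1/\varepsilon)$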
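As a sanity check on the approximation, the counting bound can be evaluated exactly for small parameters. The function name and the values of U, n, and ε below are arbitrary choices for illustration.

```python
from math import comb, log2

def filter_lower_bound_bits(U: int, n: int, eps: float) -> float:
    """log2 of C(U, n) / C(eps*U, n): bits needed to distinguish enough representations."""
    accepted = int(eps * U)          # items a single representation may answer "yes" for
    return log2(comb(U, n)) - log2(comb(accepted, n))

U, n, eps = 2**20, 100, 1 / 16
bound = filter_lower_bound_bits(U, n, eps)
target = n * log2(1 / eps)           # n log2(1/eps) = 400 bits for these values
```

For n much smaller than εU, `bound` and `target` agree to within a fraction of a bit.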


Adaptivity

David Nelson (motivation from Bloomier filters [CKR+ 04])


Why is fixing false positives important?

• Improve performance by fixing common false positives

• Security implications

• Consistent performance

• xε database accesses in expectation may mean probability ε of x accesses (for example, when the same query is repeated x times)


Adaptivity

• Goal: when a false positive comes, fix it

• Move things around so that it’s no longer a false positive

• In terms of the lower bound: on a false positive q, want to change to another memory representation that is correct for all x ∈ S and also answers q ∉ S.


Adaptivity: More formal definition [BFGJMS 18]

• An adversary generates a sequence of n queries

• When generating query i, adversary knows the result of queries 1, . . . , i − 1 (knows if each was a false positive)

• Sustained false positive rate: false positive rate of a filter against such an adversary. Filter is adaptive if it can be tuned to achieve any sustained false positive rate ε

• Goal: Sustained f.p.r. ε using n log(1/ε) + O(n) bits
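To see why a static filter fails this definition, consider the simplest adversary: probe fresh keys until one is a false positive, then repeat it forever. The sketch below is an assumed toy setup (salt-free SHA-256 hash, hypothetical key names), not the formal game from the paper.

```python
import hashlib

def h(x: str, m: int) -> int:
    """SHA-256 stands in for a random hash into [0, m)."""
    return int.from_bytes(hashlib.sha256(x.encode()).digest()[:8], "big") % m

def sustained_fpr(n=200, eps=0.05, num_queries=2000):
    m = round(n / eps)
    stored = {h(f"member{i}", m) for i in range(n)}   # static, non-adaptive filter
    found = None            # a known false positive, once the adversary has one
    fresh = 0
    false_positives = 0
    for _ in range(num_queries):
        if found is None:
            q = f"probe{fresh}"                       # never a member of S
            fresh += 1
        else:
            q = found                                 # replay the known false positive
        if h(q, m) in stored:                         # any "yes" here is a false positive
            false_positives += 1
            found = q
    return false_positives / num_queries

rate = sustained_fpr()
```

Against this adversary the observed rate is close to 1 rather than ε, which is the behavior an adaptive filter must rule out.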


Assumption (for the moment)

• Let’s assume that we can access S to fix a false positive.

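• Accessing S is free anyway: we accessed it due to the false positive.

• Without this assumption, we risk removing the original x ∈ S

[Figure: the filter (bits 0 1 0 0 1 0 0 1) answers “Yes, q ∈ S” to the query “Is q ∈ S?”; the database finds q ∉ S and returns this to the filter as advice.]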


How can we adapt?

• Change the hash?

• Difficult to get to work. But we’ll come back to this idea.

• Idea: add bits to our hash

• h : U → [n/ε] can be represented using log₂ n + log₂(1/ε) bits. What if we just add more bits on the end?


Adding bits [BFGJMS ’18]

• On a false positive caused by h(q) = h(x), add bits onto the stored value of h(x).

• Idea: let’s say h(x) outputs Θ(log n) bits; we only store a log(n/ε)-bit prefix

• After O(1) bits in expectation, the stored prefix of h(x) no longer matches h(q)

• We’ve fixed the false positive!
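The bit-adding idea can be sketched as follows. This is a simplified illustration, not the data structure from [BFGJMS ’18]: it scans all entries on each query, uses a 64-bit SHA-256 prefix as the "long" hash, and keeps a plain list `db` as a stand-in for the database that supplies the original key x. All names are hypothetical.

```python
import hashlib

HASH_BITS = 64

def full_hash(x: str) -> int:
    return int.from_bytes(hashlib.sha256(x.encode()).digest()[:8], "big")

def prefix(value: int, length: int) -> int:
    """Top `length` bits of a HASH_BITS-bit value."""
    return value >> (HASH_BITS - length)

class PrefixFilter:
    def __init__(self, keys, base_bits):
        self.db = list(keys)                        # stands in for the database (access to S)
        self.lengths = [base_bits] * len(self.db)   # filter state: stored prefix length
        self.prefixes = [prefix(full_hash(x), base_bits) for x in self.db]

    def _matches(self, q):
        hq = full_hash(q)
        return [i for i, (bits, p) in enumerate(zip(self.lengths, self.prefixes))
                if prefix(hq, bits) == p]

    def query(self, q):
        return bool(self._matches(q))

    def adapt(self, q):
        """Call after the database reports that q was a false positive."""
        hq = full_hash(q)
        for i in self._matches(q):
            hx = full_hash(self.db[i])              # needs the original key x
            # Extend the stored prefix of h(x) until it stops matching h(q).
            while (self.lengths[i] < HASH_BITS
                   and prefix(hx, self.lengths[i]) == prefix(hq, self.lengths[i])):
                self.lengths[i] += 1
            self.prefixes[i] = prefix(hx, self.lengths[i])

members = [f"member{i}" for i in range(200)]
f = PrefixFilter(members, base_bits=10)
fp = next(q for q in (f"probe{i}" for i in range(1000)) if f.query(q))
f.adapt(fp)
extra_bits = sum(f.lengths) - 10 * len(members)     # bits added by this one fix
```

After `adapt`, the query `fp` is rejected while every member is still accepted; `extra_bits` is the space the fix cost.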


Space and rebuilds [BFGJMS ’18]

• Fixing a false positive requires O(1) extra bits in expectation

• To keep to n log(1/ε) + O(n) space, need to rebuild every Θ(n) false positives

• Can deamortize over those queries.

• In paper: deamortize in-place

• Result: adaptive filter with no additional cost!


Revisiting Advice Assumption

• If we get NO information, can’t fix the query (could have q ∈ S)

• Is q ∉ S alone enough information to fix the query?

• No: need Ω(n log log U) bits to achieve adaptivity in that case. [BFG+ 18]

[Figure: the filter answers “Yes, q ∈ S” to “Is q ∈ S?”; the database returns only q ∉ S as advice.]

Advice from the database is not just cheap; it’s necessary to achieve adaptivity!


Adaptive Cuckoo Filters

Cuckoo filters [FAKM ’14]

• Practical way to create a filter

• Idea: use two log₂(2n)-bit location hashes h1(x) and h2(x), and a (1 + log₂(1/ε))-bit fingerprint hash f(x).

• Index into a table with 2n entries; each can store 1 + log₂(1/ε) bits

• Invariant: for all x ∈ S, f(x) is stored in h1(x) or h2(x).


Cuckooing for the cuckoo filter [FAKM ’14]

• To store an item x in the filter, need to store f(x) in h1(x) or h2(x).

• If one is empty, great; otherwise cuckoo: pick a fingerprint in h1(x) or h2(x) and move it to its other position.

• For any set S and hashes h1 and h2, there is a legal configuration with probability 1 − O(1/n) [PR ’01]

• In practice: need to do this dynamically without access to S; need to bound update time
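A small sketch of insertion with cuckooing. To keep the example traceable by hand, the hash functions are supplied by the caller (here as lookup tables), and the sketch keeps a shadow array of keys so that an evicted item can be re-hashed; that shadow array is exactly the "access to S" that a practical cuckoo filter has to avoid. Class and parameter names are hypothetical.

```python
class CuckooFilter:
    """One fingerprint per slot; hashes are caller-supplied for the demo."""

    def __init__(self, num_slots, locations, fingerprint, max_kicks=100):
        self.locations = locations        # x -> (h1(x), h2(x))
        self.fingerprint = fingerprint    # x -> f(x)
        self.max_kicks = max_kicks
        self.fps = [None] * num_slots     # the filter itself
        self.keys = [None] * num_slots    # shadow copy of S, used only to relocate items

    def _other(self, x, slot):
        a, b = self.locations(x)
        return b if slot == a else a

    def _put(self, x, slot):
        self.keys[slot], self.fps[slot] = x, self.fingerprint(x)

    def insert(self, x):
        a, b = self.locations(x)
        for slot in (a, b):
            if self.keys[slot] is None:
                self._put(x, slot)
                return True
        slot = a                                   # both full: start cuckooing at h1(x)
        for _ in range(self.max_kicks):
            victim = self.keys[slot]
            self._put(x, slot)                     # x takes the slot
            x, slot = victim, self._other(victim, slot)
            if self.keys[slot] is None:            # victim's other position is free
                self._put(x, slot)
                return True
        return False                               # gave up; the last victim is dropped

    def query(self, q):
        fq = self.fingerprint(q)
        return any(self.fps[s] == fq for s in self.locations(q))

# Hand-picked hash values so the eviction is easy to follow.
LOC = {"a": (0, 1), "b": (0, 2), "c": (0, 2), "z": (3, 4)}
FP = {"a": "101", "b": "010", "c": "110", "z": "111"}
cf = CuckooFilter(6, LOC.__getitem__, FP.__getitem__)
for key in "abc":
    cf.insert(key)
```

Inserting c finds both of its slots occupied, so it evicts a from slot 0, and a moves to its other position, slot 1.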


Cuckoo filter [FAKM ’14]

[Figure: a table with slots 0 to 5 holding fingerprints 101, 010, 110, 001; an item x with f(x) = 010 maps to slots h1(x) and h2(x). A second table state holds 101, 000, 111, 110, 001.]

Cuckoo filters

• Query q: look for f (q) in h1(q) and h2(q).

• False positive: the probability that f(q) = f(y) for the y stored in h1(q) is $2^{-(1+\log_2(1/\varepsilon))} = \varepsilon/2$

• Union bound gives probability ≤ ε

• Space: 2n log₂(1/ε) + 2n bits

• Can improve to 1.05n log₂(1/ε) + 2.10n bits using classic cuckoo hashing tricks
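A worked instance of the arithmetic above, taking ε = 2⁻⁸ as an arbitrary example value:

```latex
\begin{aligned}
\text{fingerprint length} &= 1 + \log_2(1/\varepsilon) = 1 + 8 = 9 \text{ bits},\\
\Pr[f(q) = f(y) \text{ in one slot}] &= 2^{-9} = \varepsilon/2,\\
\Pr[\text{false positive}] &\le 2 \cdot 2^{-9} = 2^{-8} = \varepsilon,\\
\text{space} &= 2n \cdot 9 = 2n \log_2(1/\varepsilon) + 2n = 18n \text{ bits}.
\end{aligned}
```

The extra bit in the fingerprint is what pays for checking two slots per query.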


How to Adapt with a Cuckoo Filter

Adaptive cuckoo filter [MPR ’18, ’20]

• One way to adapt with a filter

• Idea: use s hash selector bits per slot

• Rather than one fingerprint function f, have 2^s fingerprint functions f_1, . . . , f_{2^s}.

• On a false positive, change the fingerprint function of the stored element to the next fingerprint function; update the hash selector bits accordingly
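The selector-bit mechanism can be sketched as below. The hashes are caller-supplied lookup tables so the example can be followed by hand, fingerprint functions are indexed from 0 in the code (so index 0 plays the role of f_1), insertion does no cuckooing, and the `keys` array stands in for the access to S that adapting requires. All names are hypothetical.

```python
class CyclicACF:
    """Adaptive cuckoo filter sketch with s selector bits per slot."""

    def __init__(self, num_slots, s, locations, fingerprint):
        self.num_fns = 1 << s             # 2^s fingerprint functions
        self.locations = locations        # x -> (h1(x), h2(x))
        self.fingerprint = fingerprint    # (x, j) -> value of fingerprint function j on x
        self.fps = [None] * num_slots     # filter: stored fingerprint per slot
        self.sel = [0] * num_slots        # filter: which fingerprint function the slot uses
        self.keys = [None] * num_slots    # copy of S, consulted only when adapting

    def insert(self, x):
        for slot in self.locations(x):
            if self.keys[slot] is None:
                self.keys[slot], self.sel[slot] = x, 0
                self.fps[slot] = self.fingerprint(x, 0)
                return True
        return False                      # sketch only: a full filter would cuckoo here

    def _hit(self, q, slot):
        return (self.fps[slot] is not None
                and self.fps[slot] == self.fingerprint(q, self.sel[slot]))

    def query(self, q):
        return any(self._hit(q, slot) for slot in self.locations(q))

    def adapt(self, q):
        """Call after the database reports that q was a false positive."""
        for slot in self.locations(q):
            if self._hit(q, slot):
                self.sel[slot] = (self.sel[slot] + 1) % self.num_fns
                self.fps[slot] = self.fingerprint(self.keys[slot], self.sel[slot])

# Hand-picked values: x and q share a slot and their first fingerprint.
LOC = {"x": (1, 4), "q": (1, 3)}
FP = {"x": ["010", "111"], "q": ["010", "001"]}
acf = CyclicACF(6, 1, LOC.__getitem__, lambda key, j: FP[key][j])
acf.insert("x")
was_false_positive = acf.query("q")
acf.adapt("q")
```

After `adapt`, slot 1 stores the second fingerprint of x (111) with selector 1, so q no longer matches while x still does.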


[Figure: a table with slots 0 to 5 holding fingerprints 101, 010, 110, 001, plus a row of hash selector bits. Query q with f1(q) = 010 matches the fingerprint stored in slot h1(q). The stored element x has h1(x) = h1(q), f1(x) = 010, f2(x) = 111; after adapting, the table holds 101, 111, 110, 001.]

• Cycles between 2^s fingerprint functions

• s bits/slot to keep track of which is used

• On false positive: go to next fingerprint function, update fingerprint of stored element.

• Requires access to S

Adaptive cuckoo filter performance

• Simple, effective in practice

[Figure: plot from [MPR ’18]; performance on network trace data. x-axis is # queries/n; y-axis is false positives. (Lower is better.)]

• But not adaptive!

• Idea: among 1/ε^(2^s) queries, one will collide with some y on all fingerprint functions

Another way to adapt [KMP ’21]

• When you have a collision, move the item to its other slot

[Figure: a table with slots 0 to 7 holding fingerprints 101, 010, 110, 001, 000; query q with f(q) = 110 probes slots h1(q) and h2(q). After adapting, the table holds 101, 010, 110, 000, 001.]

• Recall: access to S

• Get log n “bits of adaptivity” per attempt rather than log 1/ε

• “Support optimal”: ε(1 + o(1)) false positives per unique query

• Is this adaptive?

• No: birthday attack!

• After √(n/ε²) queries can find q1 and q2 that collide with h1(y) and h2(y) for some y ∈ S.


Cuckooing for adaptivity

• Benefit: no need to store any extra bits; can augment a normal cuckoo filter with a simple “adapt” function

• Downside: Cache misses on adapts

• Inserts are fairly expensive in cuckoo hashing (lookups are cheap)
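A sketch of what such an “adapt” function might look like. It is an illustration under assumptions, not the implementation from [KMP ’21]: hashes are caller-supplied lookup tables, insertion does no cuckooing, and the `keys` array stands in for access to S. The colliding item is moved to its other slot, cuckooing any occupant it displaces.

```python
class CuckooingACF:
    """Cuckoo filter sketch whose adapt step relocates the colliding item."""

    def __init__(self, num_slots, locations, fingerprint, max_kicks=100):
        self.locations = locations        # x -> (h1(x), h2(x))
        self.fingerprint = fingerprint    # x -> f(x)
        self.max_kicks = max_kicks
        self.fps = [None] * num_slots     # the filter: no extra metadata per slot
        self.keys = [None] * num_slots    # copy of S, consulted only when moving items

    def _other(self, x, slot):
        a, b = self.locations(x)
        return b if slot == a else a

    def insert(self, x):
        for slot in self.locations(x):
            if self.keys[slot] is None:
                self.keys[slot], self.fps[slot] = x, self.fingerprint(x)
                return True
        return False                      # sketch only: a full filter would cuckoo here

    def query(self, q):
        fq = self.fingerprint(q)
        return any(self.fps[s] == fq for s in self.locations(q))

    def _move(self, slot):
        """Move the item in `slot` to its other location, cuckooing occupants."""
        x = self.keys[slot]
        self.keys[slot] = self.fps[slot] = None
        for _ in range(self.max_kicks):
            slot = self._other(x, slot)
            x, self.keys[slot] = self.keys[slot], x      # place x, pick up old occupant
            self.fps[slot] = self.fingerprint(self.keys[slot])
            if x is None:                                # the slot was empty: done
                return True
        return False                                     # gave up; one item is dropped

    def adapt(self, q):
        """Call after the database reports that q was a false positive."""
        fq = self.fingerprint(q)
        for slot in self.locations(q):
            if self.fps[slot] == fq:
                self._move(slot)

# Hand-picked values: y and q share slot 2 and the fingerprint 110.
LOC = {"y": (2, 5), "q": (2, 6)}
FP = {"y": "110", "q": "110"}
acf = CuckooingACF(8, LOC.__getitem__, FP.__getitem__)
acf.insert("y")
was_false_positive = acf.query("q")
acf.adapt("q")
```

After `adapt`, y sits in slot 5, which q never probes, so the false positive is gone without storing any selector bits.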


Less metadata means better performance!

[Figure: false positive rate (log scale, 0.001 to 0.1) versus A/S ratio (0 to 100) on the Chicago A trace with f = 8, comparing Cuckoo Filter, Cuckooing ACF, Swapping ACF, and Cyclic ACF with s = 1, 2, 3.]

What do we need for adaptivity?

Lemma 1 (KMP ’21)

If a filter can only use O(1) configurations for a given element it is not adaptive.

• Proof is basically a generalization of the birthday attack

• Adaptive filter of [BFG+ 18] does not fall under this lemma (can use more bits for certain elements; up to O(log n))

• Takeaway: necessary to allocate resources unevenly to achieve adaptivity


Telescoping adaptive filter [LMSS ’21]

• Adaptive Cuckoo Filter with variable number of bits per element (implies actual adaptivity!)

• Uses arithmetic coding: can use a fractional number of bits per element

• Competitive with other filters in practice; reasonable throughput

• Cuckooing adaptive filter is still best when adapting is easy


Conclusion

• Adaptivity improves filter performance

• Simple methods can be effective in practice without achieving worst-case adaptivity


Open problems

• What can we do without advice?

• Need to limit adversary

• Might be possible to “hide” false positives rather than fixing them

• Simple, practical way to fix false positives short term without overhead


Thanks!