The Bloom Paradox Ori Rottenstreich Joint work with Yossi Kanizo and Isaac Keslassy Technion,...
The Bloom Paradox
Ori Rottenstreich
Joint work with
Yossi Kanizo and Isaac Keslassy
Technion, Israel
Problem Definition
• Requirement: a data structure at the user with a fast answer to "is $x \in S$?"
• Solutions:
o O(n): searching in a list
o O(log(n)): searching in a sorted list
o O(1): but with false positives / negatives
[Figure: a user fetches x from the local cache S at cost = 1, or y from the central memory M, which holds all elements, at cost = 10.]
Two Possible Errors
• False positive: $x \notin S$, but the data structure answers $x \in S$.
o Results in a redundant access to the local cache: additional cost of 1.
• False negative: $x \in S$, but the data structure answers $x \notin S$.
o Results in an expensive access to the central memory instead of the local cache: additional cost of 10 - 1 = 9.
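The two additional costs can be checked with a small sketch. The function name `access_cost` and its arguments are illustrative and not taken from the slides; the cost values 1 and 10 are those of the cache/memory example.

```python
CACHE_COST = 1    # cost of one access to the local cache
MEMORY_COST = 10  # cost of one access to the central memory


def access_cost(in_cache: bool, answer_in_cache: bool) -> int:
    """Total cost of serving one element, given the data structure's answer."""
    if answer_in_cache:
        # The cache is tried first; on a false positive the memory is still needed.
        return CACHE_COST if in_cache else CACHE_COST + MEMORY_COST
    # The cache is skipped and the element is fetched from the central memory.
    return MEMORY_COST


# Additional cost of each error relative to a correct answer.
false_positive_extra = access_cost(False, True) - access_cost(False, False)
false_negative_extra = access_cost(True, False) - access_cost(True, True)
print(false_positive_extra, false_negative_extra)
```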
Bloom Filters (Bloom, 1970)
• Initialization: array of $m$ zero bits.
• Insertion: each of the $n$ elements is hashed $k$ times, and the corresponding bits are set.
• Query: hash the element and check that all $k$ bits are set.
• False positive rate (probability) of approximately $(1 - e^{-kn/m})^k$.
• No false negatives.
[Figure: bit arrays before and after inserting elements, with example queries for the elements x, y, z and w.]
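A minimal sketch of the three operations, assuming that the hash functions are derived from SHA-256 with an index prefix. The class name `BloomFilter`, the parameter names `m` and `k`, and the hashing choice are illustrative assumptions, not the construction used in the talk.

```python
import hashlib


class BloomFilter:
    def __init__(self, m: int, k: int) -> None:
        self.m = m            # number of bits
        self.k = k            # number of hash functions
        self.bits = [0] * m   # initialization: array of zero bits

    def _positions(self, item: str):
        # Assumption: the i-th hash function is SHA-256 of "i:item", reduced mod m.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def query(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=3)
for element in ("x", "y", "z"):
    bf.insert(element)
print(bf.query("x"), bf.query("y"), bf.query("z"))
```

An inserted element always passes the query, since insertion sets exactly the bits that the query checks.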
Bloom Filters are Widely Used
• Cache/Memory Framework
• Packet Classification
• Intrusion Detection
• Routing
• Accounting
• Beyond networking: Spell Checking, DNA Classification
• Can be found in:
o Google's web browser Chrome
o Google's database system BigTable
o Facebook's distributed storage system Cassandra
o Mellanox's IB Switch System
Outline
Introduction to Bloom Filters
The Bloom Paradox
The Variable-Increment Counting Bloom Filter
The Bloom Paradox
Sometimes it is better to disregard the Bloom filter's results, and in fact not even to query it, thus making the Bloom filter useless.
Example
• Parameters:
• Extreme case without locality: all elements have an equal probability of belonging to the cache.
o Toy example
[Figure: the Bloom filter.]
The Bloom Paradox
• Parameters:
• Let $B$ be the set of elements that the Bloom filter indicates are in $S$.
o In particular, no false negatives → $S \subseteq B$.
• Intuition:
• Surprise: the Bloom filter indicates the membership of $|B|$ elements. Only $|S|$ of them are indeed in $S$.
[Figure: the user, the set B of the Bloom filter, the local cache S (cost = 1) and the central memory M with all elements (cost = 10).]
The Bloom Paradox
• When the Bloom filter states that $x \in S$, it is wrong with probability
• Average cost if we listen to the Bloom filter:
• Average cost if we don't:
• Don't listen to the Bloom filter.
• The Bloom filter is useless!
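The comparison can be sketched numerically. The values of `n`, `N` and `fpr` below are hypothetical, chosen only to illustrate the effect; they are not the parameters from the slides. The sketch assumes queries drawn uniformly from the central memory, and it compares costs for the queries on which the Bloom filter answers positively (on a negative answer both strategies go to the memory).

```python
CACHE_COST = 1
MEMORY_COST = 10


def wrong_probability(n: int, N: int, fpr: float) -> float:
    """P(x not in S | filter says x in S) for uniform queries over N elements."""
    false_positives = fpr * (N - n)   # expected size of B without S
    return false_positives / (n + false_positives)


def cost_if_listening(p_wrong: float) -> float:
    # Try the cache first; with probability p_wrong the memory is needed as well.
    return CACHE_COST + p_wrong * MEMORY_COST


# Hypothetical parameters: cache size, memory size, false positive rate.
n, N, fpr = 10**4, 10**10, 1e-3
p_wrong = wrong_probability(n, N, fpr)
listen = cost_if_listening(p_wrong)
ignore = MEMORY_COST   # always go straight to the central memory
print(p_wrong, listen, ignore)
```

Under this cost model, listening is worse than ignoring exactly when the probability of a wrong positive answer exceeds 0.9, which happens when the cache holds a very small fraction of all elements.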
Counting Bloom Filters (CBFs)
• Bloom filters do not support deletions of elements. Simply resetting bits might cause false negatives.
• The solution: counting Bloom filters, which store an array of counters instead of bits.
o Insertion: incrementing counters by one.
o Deletion: decrementing counters by one.
o Query: checking that counters are positive.
• The same false positive probability.
• Require too much memory, e.g. 57 bits per element.
[Figure: counter arrays updated by +1 increments for the elements x and y.]
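A minimal sketch of a counting Bloom filter with the three operations listed above. The class name, the parameters `m` and `k`, and the SHA-256 based hashing are illustrative assumptions; the sketch also assumes that only previously inserted elements are deleted.

```python
import hashlib


class CountingBloomFilter:
    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.counters = [0] * m   # counters instead of bits

    def _positions(self, item: str):
        # Assumption: the i-th hash function is SHA-256 of "i:item", reduced mod m.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item: str) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def delete(self, item: str) -> None:
        # Only valid for an element that was inserted before.
        for pos in self._positions(item):
            self.counters[pos] -= 1

    def query(self, item: str) -> bool:
        return all(self.counters[pos] > 0 for pos in self._positions(item))


cbf = CountingBloomFilter(m=64, k=3)
cbf.insert("x")
cbf.insert("y")
cbf.delete("x")
print(cbf.query("y"))
```

Deleting x undoes exactly the increments made by its insertion, so y still passes the query; this is what resetting bits in a plain Bloom filter cannot guarantee.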
Intuition for Variable Increments
• Upon query, we should consider the exact values of the counters, not just their positiveness.
• Can we design a deterministic scheme that exploits the exact values of the counters?
• Idea: use variable increments to encode the element identity.
[Figure: an array of counters queried for the elements y and z.]
Architecture
• Each hash entry contains a pair of counters:
o $c_1$, fixed increments → number of elements in the entry (as in the CBF)
o $c_2$, variable increments → weighted sum of the elements
o weights from a pre-determined set
• We use two sets of hash functions:
o The first set uses hash functions whose range is the set of entries, i.e. each one points to an entry.
o The second set uses hash functions whose range indexes the pre-determined set, i.e. each one points to a weight.
[Figure: an array of entries, each holding the two counters c1 and c2.]
Insertion
• Insertion: at each entry that the element hashes to, the two counters are updated as follows.
o $c_1$ is incremented by one
o $c_2$ is incremented by a weight from the pre-determined set
• Example 1:
[Figure: insertion of x (labels +4, +8) and z (labels +4, +13) into the array of (c1, c2) entries.]
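The insertion rule can be sketched as follows. The class name `VariableIncrementCBF`, the helper `_hash`, the SHA-256 based hash families and the increment set `[1, 4, 13]` are all illustrative assumptions, not the authors' implementation.

```python
import hashlib


def _hash(label: str, i: int, item: str, modulus: int) -> int:
    # Assumption: hash families are derived from SHA-256 with a label and an index.
    digest = hashlib.sha256(f"{label}:{i}:{item}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class VariableIncrementCBF:
    def __init__(self, m: int, k: int, increments) -> None:
        self.m = m
        self.k = k
        self.increments = list(increments)  # pre-determined set of weights
        self.c1 = [0] * m                   # fixed increments
        self.c2 = [0] * m                   # variable increments

    def entries(self, item: str):
        # First family picks the entry, second family picks the weight.
        return [
            (_hash("h", i, item, self.m),
             self.increments[_hash("g", i, item, len(self.increments))])
            for i in range(self.k)
        ]

    def insert(self, item: str) -> None:
        for pos, inc in self.entries(item):
            self.c1[pos] += 1     # one more element in this entry
            self.c2[pos] += inc   # weighted sum of the elements


vicbf = VariableIncrementCBF(m=16, k=2, increments=[1, 4, 13])
for element in ("x", "z"):
    vicbf.insert(element)
print(list(zip(vicbf.c1, vicbf.c2)))
```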
Query
• Query ($y$ with variable increments 4 and 8)
• We ask whether:
o 17 can be a sum of 2 elements from the set, including 4
o 30 can be a sum of 3 elements from the set, including 8
• No: $y \notin S$
• How should we pick the set of variable increments? We should use $B_h$ sequences!
[Figure: the array of (c1, c2) entries, with the two entries that y hashes to.]
Bh Sequences
• Definition 1: let $D = \{v_1, v_2, \dots, v_\ell\}$ be a sequence of positive integers. Then $D$ is a $B_h$ sequence iff all the sums $v_{i_1} + v_{i_2} + \dots + v_{i_h}$ with $1 \le i_1 \le i_2 \le \dots \le i_h \le \ell$ are distinct.
• Example 2: all the sums of $h$ elements of $D$ are distinct. Therefore, $D$ is a $B_h$ sequence.
• $B_h$ sequences are widely used in error-correcting codes.
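The definition can be checked by brute force. This is a sketch for small sets only; the function name `is_bh_sequence` and the sample sets are illustrative choices, not necessarily those of the slides.

```python
from itertools import combinations_with_replacement


def is_bh_sequence(values, h: int) -> bool:
    """True iff all sums of h elements (repetition allowed) are distinct."""
    sums = [sum(combo) for combo in combinations_with_replacement(values, h)]
    return len(sums) == len(set(sums))


print(is_bh_sequence([1, 4, 13], 2))  # sums: 2, 5, 14, 8, 17, 26
print(is_bh_sequence([1, 2, 3], 2))   # 1 + 3 and 2 + 2 collide
```

`combinations_with_replacement` enumerates each multiset of `h` elements once, which matches the ordered indices in the definition.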
The Bh-CBF Scheme Query
• Example 3: the set of variable increments is a $B_h$ sequence.
o For $x$: the Bh-CBF can determine that $x \notin S$.
o For $y$: here the counter values leave a single possibility, and the Bh-CBF can determine that $y \notin S$.
o For $z$: the Bh-CBF cannot exclude that $z \in S$.
[Figure: the array of (c1, c2) entries, with the entries queried for x, y and z.]
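A sketch of the query logic, under the assumption that an entry is decoded only when its count c1 is at most h, where the B_h property makes the decomposition of c2 unique. The names `decompose` and `may_contain`, the increment set `D` and the counter values are hypothetical. The decomposition is done by brute force, which is only practical for small sets.

```python
from itertools import combinations_with_replacement


def decompose(total: int, count: int, increments):
    """All multisets of `count` increments whose sum is `total`."""
    return [
        combo
        for combo in combinations_with_replacement(increments, count)
        if sum(combo) == total
    ]


def may_contain(entries, c1, c2, increments, h: int) -> bool:
    """False means the element is surely absent; True means it cannot be excluded."""
    for pos, inc in entries:
        if c1[pos] == 0:
            return False              # same test as in a plain CBF
        if c1[pos] <= h:
            combos = decompose(c2[pos], c1[pos], increments)
            if not any(inc in combo for combo in combos):
                return False          # the element's increment is not part of the sum
    return True


D = (1, 4, 13)    # hypothetical increment set, a B_2 sequence
c1 = [2, 0, 1]    # hypothetical counter values
c2 = [17, 0, 13]

# Each query lists the (entry, increment) pairs selected by the two hash families.
print(may_contain([(0, 4), (2, 13)], c1, c2, D, h=2))  # 17 = 4 + 13 includes 4
print(may_contain([(0, 1), (2, 13)], c1, c2, D, h=2))  # 17 = 4 + 13 excludes 1
```

When c1 exceeds h, the sketch falls back to the plain CBF test for that entry, so it never produces a false negative.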
Experimental Results
• Internet trace (equinix-chicago) with real hash functions.
For the Bh-CBF, (with ).
Concluding Remarks
• The Bloom Paradox
o Discovery of the Bloom paradox
o Importance of the a priori membership probability
• The Variable-Increment Counting Bloom Filter
o Can extend many variants of the counting Bloom filter
o First time $B_h$ sequences are presented in networking applications
Thank You