Distributed Quota Enforcement for Spam Control
Jee Whan Choi, Chaoting Xuan
Contents
– Introduction
– Distributed Quota Enforcement (DQE)
– DQE Architecture
– Enforcer Design
– Evaluation
– Conclusions
Introduction
Spam
– Unsolicited bulk email
– 50-70% of email today is spam
Spam Filters
– Email text scanning
– False-positive rate is approximately 1%
– Economic damage estimated at hundreds of millions of dollars
Distributed Quota Enforcement (DQE)
– Quotas on the number of emails a sender can send
Distributed Quota Enforcement
Design Objectives
– Protocol: no false positives, untrusted enforcer, privacy
– Enforcer: scalability, fault tolerance, high throughput, attack resiliency, mutually untrusting nodes
Architecture
Quota Allocation and Creation
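Quota Allocation
– Quotas are allocated by a select few globally trusted quota allocators (QAs)
– Cs = { Spub, expiration time, quota }QApriv (the certificate Cs is signed with the QA's private key)
Stamp
– Created by the sender
– Stamp = { Cs, {i, t}Spriv } (the pair {i, t} is signed with the sender's private key)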
Stamp Cancellation Protocol
Protocol Objectives
No False Positives
– The hash is unique and one-way
Untrusted Enforcer
– Returns a proof of reuse (the fingerprint)
Privacy
– A hash of the stamp is used instead of the stamp itself
An adversary cannot cancel a victim's stamp before it is created
– The stamp is signed with the sender's private key
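The receiver-side check implied by these objectives can be sketched as follows. This is a minimal illustration, not the authors' implementation. The names `postmark`, `ToyEnforcer`, and `check_and_cancel`, the choice of SHA-256, and the two-level hashing (fingerprint = hash of the stamp, lookup key = hash of the fingerprint) are assumptions. The slides only state that a hash of the stamp is used and that the enforcer returns a fingerprint as proof of reuse.

```python
import hashlib


def H(data: bytes) -> bytes:
    """One-way hash; SHA-256 is an assumption, the slides name no function."""
    return hashlib.sha256(data).digest()


class ToyEnforcer:
    """Single-process stand-in for the distributed enforcer (hypothetical)."""

    def __init__(self):
        self.store = {}

    def test(self, postmark: bytes):
        # Returns the stored fingerprint, or None for "not found".
        return self.store.get(postmark)

    def set(self, postmark: bytes, fingerprint: bytes):
        self.store.setdefault(postmark, fingerprint)


def check_and_cancel(enforcer, stamp: bytes) -> str:
    fingerprint = H(stamp)      # only someone who has seen the stamp can compute this
    postmark = H(fingerprint)   # lookup key; reveals neither stamp nor fingerprint
    reply = enforcer.test(postmark)
    # The receiver trusts a "reused" claim only if the reply hashes to the
    # postmark, so an untrusted enforcer cannot fabricate proof of reuse.
    if reply is not None and H(reply) == postmark:
        return "reused"
    enforcer.set(postmark, fingerprint)
    return "fresh"
```

Under these assumptions, the first call for a given stamp returns `"fresh"` and cancels it, and a second call with the same stamp returns `"reused"`.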
Enforcer
Comprises thousands of untrusted storage nodes
Stores the fingerprints of stamps cancelled in the current and previous epochs
The list of approved nodes is published by a trusted authority (the Bunker)
The node receiving a client's request is called the portal for that request
– A client can discover a portal via hard-coding or DNS
Enforcer Design
TEST
– Local check at the portal
– If not found, sequentially send the request to other nodes (the assigned nodes)
– Assigned nodes are determined by the key k and r independent hash functions, similar to Chord; r is a configurable system parameter
– If any node contains k's value, return it; otherwise return "not found"
SET
– Local store at the portal
– Also store the value in a randomly chosen node from the assigned nodes
TEST and SET Algorithm
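The TEST and SET steps described on the preceding slides can be sketched roughly as below. This is an illustration under stated assumptions: `Node`, `assigned_nodes`, `enforcer_test`, and `enforcer_set` are hypothetical names, salted SHA-256 modulo the node count stands in for the r independent hash functions, and network calls and node failures are not modeled.

```python
import hashlib
import random


class Node:
    """One enforcer node with an in-memory key-value store (illustrative)."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def put(self, key, value):
        self.store[key] = value


def assigned_nodes(key, nodes, r):
    """Map a key to r nodes using r salted hashes.

    Assumption: hash-mod-n replaces the Chord-like mapping, so two of the
    r hashes may occasionally select the same node.
    """
    chosen = []
    for i in range(r):
        digest = hashlib.sha256(i.to_bytes(4, "big") + key).digest()
        chosen.append(nodes[int.from_bytes(digest, "big") % len(nodes)])
    return chosen


def enforcer_test(portal, nodes, key, r):
    """TEST: local check, then ask the assigned nodes one at a time."""
    value = portal.get(key)
    if value is not None:
        return value
    for node in assigned_nodes(key, nodes, r):
        value = node.get(key)
        if value is not None:
            return value
    return None  # "not found"


def enforcer_set(portal, nodes, key, value, r, rng=random):
    """SET: store locally and on one randomly chosen assigned node."""
    portal.put(key, value)
    rng.choice(assigned_nodes(key, nodes, r)).put(key, value)
```

Because TEST visits all r assigned nodes before giving up, a value stored through one portal can be found through any other portal as long as the node that holds it responds honestly.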
Stamp Reuse and Fault Tolerance
False negatives are possible
Byzantine faults and crash faults are treated the same
– An adversarial node giving a false negative ("not found" response) has the same outcome as a node not responding (crash fault)
Stamp reuse depends on the parameters r and p
– p: fraction of the n total machines that fail during a 2-day cycle
– Expected number of times a stamp is used before its fingerprint has been placed on a good node: $\frac{1}{1-2p} + p^{r} n$
– If we assume $r = 1 + \log_{1/p} n$, then $p^{r} n = p$ and the expected number of uses is approximately $1 + 3p = 1.3$ for $p = 0.1$
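As a quick numeric check of the expression above, the following sketch evaluates it for p = 0.1. The function name `expected_uses` and the value n = 10,000 are illustrative assumptions; with r chosen as on the slide, the result does not depend on n.

```python
import math


def expected_uses(p: float, n: int, r: float) -> float:
    """Evaluate 1/(1 - 2p) + p**r * n from the slide."""
    return 1 / (1 - 2 * p) + (p ** r) * n


p = 0.1
n = 10_000                      # assumed node count, for illustration only
r = 1 + math.log(n, 1 / p)      # r = 1 + log_{1/p} n, about 5 here

exact = expected_uses(p, n, r)  # 1/(1 - 0.2) + p = 1.25 + 0.1
approx = 1 + 3 * p              # small-p approximation: 1/(1 - 2p) ~ 1 + 2p
print(round(exact, 3), round(approx, 3))
```

The exact expression gives about 1.35, and the slide's figure of 1.3 comes from the first-order approximation 1/(1-2p) ≈ 1 + 2p.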
Improvement of Fault Tolerance (our speculation)
Randomly choose two or more nodes from the assigned nodes to store the (key, value) pair in the PUT algorithm.
This increases overall storage usage but should significantly improve the stamp reuse detection rate.
GET and PUT
GET and PUT (Continued)
– PUTs are fast
– Crash recovery of previously cancelled keys
– Key-value pairs are small in size
– "Not Found" answers are almost always fast
– "Found" answers are slow
Avoiding Distributed Livelock
Distributed Pipeline:
1. TEST/SET requests from clients.
2. GET/PUT requests from other enforcer nodes.
3. GET/PUT responses.
Under overload, drop work at the beginning of the pipeline (new TEST/SET requests) to maximize throughput.
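One way to realize "drop at the beginning of the pipeline" is a bounded scheduler that serves later stages first and sheds earlier-stage work when full. The sketch below is a hypothetical policy for illustration; the class `Scheduler`, the stage names, and the eviction rule are assumptions, not the mechanism used by the authors.

```python
from collections import deque

# Pipeline stages, earliest first. Work in later stages has already
# consumed resources elsewhere, so it is the most costly to discard.
STAGES = ("client_request", "node_request", "node_response")


class Scheduler:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.queues = {stage: deque() for stage in STAGES}

    def __len__(self):
        return sum(len(q) for q in self.queues.values())

    def enqueue(self, stage: str, item) -> bool:
        """Queue an item; when full, make room by evicting earlier-stage work.

        Returns False if the incoming item itself is dropped.
        """
        if len(self) >= self.capacity:
            for s in STAGES:
                if s == stage:
                    return False          # nothing earlier to evict
                if self.queues[s]:
                    self.queues[s].popleft()
                    break
        self.queues[stage].append(item)
        return True

    def next_item(self):
        """Serve the latest pipeline stage that has pending work."""
        for s in reversed(STAGES):
            if self.queues[s]:
                return s, self.queues[s].popleft()
        return None
```

With this policy an overloaded node keeps finishing requests that are already in flight and refuses new client requests, rather than accepting work it cannot complete.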
Resource Exhaustion Attacks
Attack: a flood of spurious TEST/SET requests.
Assumption: attackers (or the zombies they control) have a bandwidth limit.
Solution: exhaust the attackers' bandwidth by requiring large or repeated TEST/SET packets.
Performance Evaluation
Performance Evaluation (Continued)
Enforcer Size
1. 100 billion emails daily
2. 65% spam
3. 65 billion disk seeks/day (pessimistic)
4. 400 disk seeks/second/node
5. 86,400 seconds/day
Result: 1881 nodes (3 GHz CPU, 1 GB RAM, 3 Mbit/s bandwidth)
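The node count follows from the figures on this slide: divide the daily disk seeks by the seeks one node can perform per day and round up. The function name `enforcer_nodes_needed` is illustrative.

```python
def enforcer_nodes_needed(seeks_per_day: int,
                          seeks_per_second_per_node: int,
                          seconds_per_day: int = 86_400) -> int:
    """Smallest number of nodes whose combined seek rate covers the daily load."""
    seeks_per_node_per_day = seeks_per_second_per_node * seconds_per_day
    # Integer ceiling division avoids floating-point rounding.
    return -(-seeks_per_day // seeks_per_node_per_day)


nodes = enforcer_nodes_needed(65_000_000_000, 400)
print(nodes)  # 65e9 / (400 * 86,400) is about 1880.8, which rounds up to 1881
```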
Performance Evaluation (Continued)
Questions?