Distributed Quota Enforcement for Spam Control Jee Whan Choi Chaoting Xuan.

Post on 17-Jan-2016

Distributed Quota Enforcement for Spam Control

Jee Whan Choi, Chaoting Xuan

Contents

– Introduction
– Distributed Quota Enforcement (DQE)
– DQE Architecture
– Enforcer Design
– Evaluation
– Conclusions

Introduction

SPAM
– Unsolicited bulk email
– 50-70% of email today is spam

SPAM Filters
– Email text scanning
– Rate of false positives is approximately 1%
– Economic damage estimated at hundreds of millions of dollars

Distributed Quota Enforcement (DQE)
– Quotas on the number of emails a sender can send

Distributed Quota Enforcement

Design Objectives
– Protocol: no false positives, untrusted enforcer, privacy
– Enforcer: scalability, fault tolerance, high throughput, attack resiliency, mutually untrusting nodes

Architecture

Quota Allocation and Creation

Quota Allocation
– Quotas are allocated by a select few globally trusted quota allocators (QAs)
– A sender's certificate is signed with the allocator's private key:
  C_S = { S_pub, expiration time, quota }_{QA_priv}

Stamp
– Created by the sender
– Stamp = { C_S, {i, t}_{S_priv} }

Stamp Cancellation Protocol

Protocol Objectives

No False Positives
– The hash is unique and one-way

Untrusted Enforcer
– Returns a proof of reuse (the fingerprint)

Privacy
– A hash of the stamp is used instead of the stamp itself

An adversary cannot cancel a victim’s stamp before it is created
– The stamp carries a signature made with the sender’s private key
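The objectives above can be illustrated with a minimal sketch of a receiver cancelling a stamp. Several details here are assumptions rather than anything the slides specify: SHA-256 as the hash, the class `ToyEnforcer` (a single in-memory dictionary standing in for the real distributed enforcer), the function `cancel_stamp`, and the choice to key the enforcer by the hash of the fingerprint so that a returned fingerprint can be checked by the receiver.

```python
import hashlib


def h(data: bytes) -> bytes:
    """One-way hash. SHA-256 is an assumption; the slides name no function."""
    return hashlib.sha256(data).digest()


class ToyEnforcer:
    """Hypothetical in-memory stand-in for the untrusted enforcer."""

    def __init__(self):
        self._store = {}

    def test(self, key: bytes):
        # Returns the stored fingerprint, or None for "not found".
        return self._store.get(key)

    def set(self, key: bytes, value: bytes) -> None:
        self._store[key] = value


def cancel_stamp(enforcer, stamp: bytes) -> bool:
    """Return True if the stamp looks fresh (and is now cancelled),
    False if the enforcer produced a valid proof of reuse."""
    fingerprint = h(stamp)   # the enforcer never sees the stamp itself
    key = h(fingerprint)     # assumption: lookups are keyed by a second hash
    proof = enforcer.test(key)
    # The enforcer is untrusted, so the receiver verifies the proof.
    # A bogus answer cannot make a fresh stamp look reused.
    if proof is not None and h(proof) == key:
        return False
    enforcer.set(key, fingerprint)
    return True
```

In this sketch a lying enforcer can only cause a false negative (a reused stamp accepted as fresh), never a false positive, because only someone who has seen the stamp can produce a value that hashes to the key.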

Enforcer

Comprises thousands of untrusted storage nodes

Enforcer stores the fingerprints of stamps cancelled in the current and previous epochs

The list of approved nodes is published by a trusted authority (the bunker)

The node receiving a client’s request is called the portal for that request
– A client can discover a portal via hard-coding or DNS

Enforcer Design

TEST
– Local check
– If not found, sequentially send the request to other nodes (the assigned nodes)
– Assigned nodes are determined by k and r independent hash functions, similar to Chord; r is a configurable system parameter
– If any node contains k’s value, return it; otherwise return “not found”

SET
– Local store
– Also store the value in a randomly chosen node from the assigned nodes

TEST and SET Algorithm
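The TEST and SET steps described on the preceding slides can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pseudocode: the `Node` class, the `make_cluster` helper, the use of SHA-256 salted with an index to stand in for the r independent hash functions, and the in-process method calls in place of network requests are all hypothetical.

```python
import hashlib
import random


class Node:
    """Hypothetical enforcer node; peers are called directly, not over a network."""

    def __init__(self, node_id, cluster, r, rng):
        self.node_id = node_id
        self.cluster = cluster  # shared list of all nodes
        self.r = r              # number of assigned nodes per key (r < 256 here)
        self.rng = rng
        self.store = {}
        self.alive = True

    def assigned_nodes(self, key: bytes):
        # Assumption: hash function j is SHA-256 over a one-byte index plus the key.
        n = len(self.cluster)
        nodes = []
        for j in range(self.r):
            digest = hashlib.sha256(bytes([j]) + key).digest()
            nodes.append(self.cluster[int.from_bytes(digest[:8], "big") % n])
        return nodes

    def get(self, key):
        # A crashed (or lying) node answers "not found".
        return self.store.get(key) if self.alive else None

    def put(self, key, value):
        if self.alive:
            self.store[key] = value

    def test(self, key):
        # Local check first.
        if key in self.store:
            return self.store[key]
        # Then ask the assigned nodes one at a time.
        for node in self.assigned_nodes(key):
            value = node.get(key)
            if value is not None:
                return value
        return None  # "not found"

    def set(self, key, value):
        # Local store, plus one randomly chosen assigned node.
        self.store[key] = value
        self.rng.choice(self.assigned_nodes(key)).put(key, value)


def make_cluster(n, r, seed=0):
    rng = random.Random(seed)
    cluster = []
    for i in range(n):
        cluster.append(Node(i, cluster, r, rng))
    return cluster
```

For example, after `cluster = make_cluster(8, 3)` and `cluster[0].set(b"k1", b"v1")`, a TEST issued through a different portal such as `cluster[5].test(b"k1")` finds the value as long as the assigned node that received the PUT is still alive.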

Stamp Reuse and Fault Tolerance

False negatives are possible, so a stamp can occasionally be reused.

Byzantine faults and crash faults are treated the same
– An adversarial node giving a false negative (a “not found” response) has the same outcome as a node not responding (a crash fault)

Stamp reuse depends on the parameters r and p
– p: the fraction of the n total machines that fail during a 2-day cycle
– Expected number of times a stamp is used before its fingerprint has been placed on a good node: $1/(1-2p) + p^r n$
– If we assume $r = 1 + \log_{1/p} n$, then $p^r n = p$ and the expected use is $1/(1-2p) + p \approx 1 + 3p = 1.3$ for $p = 0.1$
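The arithmetic on this slide can be checked numerically. The sketch below reads the slide's formula as 1/(1-2p) + p^r · n and its choice of r as 1 + log base 1/p of n; the function names and the cluster size n = 2000 are illustrative assumptions.

```python
import math


def expected_uses(p: float, r: float, n: int) -> float:
    """Expected uses of a stamp, reading the slide's formula as 1/(1-2p) + p^r * n."""
    return 1.0 / (1.0 - 2.0 * p) + (p ** r) * n


def replication_for(p: float, n: int) -> float:
    """r = 1 + log_{1/p}(n), the value assumed on the slide."""
    return 1.0 + math.log(n) / math.log(1.0 / p)


p, n = 0.1, 2000                   # n is an illustrative value, not from the slides
r = replication_for(p, n)
exact = expected_uses(p, r, n)     # p^r * n collapses to p, giving 1/(1-2p) + p
first_order = 1 + 3 * p            # the slide's approximation for small p
```

With this choice of r the term p^r · n reduces to p regardless of n, so the exact value for p = 0.1 is 1.25 + 0.1 = 1.35, which the slide rounds to 1 + 3p = 1.3.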

Improvement of Fault Tolerance (our speculation)

Randomly choose two or more nodes from the assigned nodes to store the (key, value) pair in the PUT algorithm.

This would increase overall storage usage, but could significantly improve the stamp reuse detection rate.

GET and PUT

GET and PUT (Continued)

– PUTs are fast
– Crash recovery of previously cancelled keys
– Key-value pairs are small in size
– “Not Found” answers are almost always fast
– “Found” answers are slow

Avoiding Distributed Livelock

Distributed Pipeline:
1. TEST/SET requests from clients.
2. GET/PUT requests from other enforcer nodes.
3. GET/PUT responses.

Under overload, drop requests at the beginning of the pipeline to maximize throughput.
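One way to read this policy is that work which has already travelled further down the pipeline is served first, and new client requests are shed first when a node is overloaded. The sketch below illustrates that reading only; the `StagedQueue` class, its fixed capacity, and the choice to evict the oldest earlier-stage item are assumptions, not details from the slides.

```python
from collections import deque

# Pipeline stages, numbered as on the slide.
CLIENT_REQUEST, PEER_REQUEST, PEER_RESPONSE = 1, 2, 3


class StagedQueue:
    """Hypothetical bounded work queue that sheds early-stage work first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.queues = {
            CLIENT_REQUEST: deque(),
            PEER_REQUEST: deque(),
            PEER_RESPONSE: deque(),
        }

    def __len__(self) -> int:
        return sum(len(q) for q in self.queues.values())

    def offer(self, stage: int, item) -> bool:
        """Enqueue an item; return False if it was dropped."""
        if len(self) >= self.capacity:
            # Full: make room by dropping work from an earlier stage, if any.
            for earlier in (CLIENT_REQUEST, PEER_REQUEST):
                if earlier < stage and self.queues[earlier]:
                    self.queues[earlier].popleft()
                    break
            else:
                return False  # nothing earlier to evict, so drop the new item
        self.queues[stage].append(item)
        return True

    def take(self):
        """Serve the latest pipeline stage first; None when idle."""
        for stage in (PEER_RESPONSE, PEER_REQUEST, CLIENT_REQUEST):
            if self.queues[stage]:
                return self.queues[stage].popleft()
        return None
```

The intent is that a node never discards a response whose request has already consumed resources elsewhere in order to accept a fresh client request, which is the situation that would otherwise lead to livelock.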

Resource Exhaustion Attacks

Attack: a flood of spurious TEST/SET requests.

Assumption: attackers (or the zombies they control) have a bandwidth limit.

Solution: max out the attackers’ bandwidth by requiring large or multiple copies of TEST/SET packets.

Performance Evaluation

Performance Evaluation (Continued)

Enforcer Size
1. 100 billion emails daily
2. 65% spam
3. 65 billion disk seeks/day (pessimistic)
4. 400 disk seeks/second/node
5. 86,400 seconds/day

Result: 1881 nodes (3 GHz CPU, 1 GB RAM, 3 Mbit/s bandwidth)
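The node count follows directly from the figures on this slide. The short calculation below reproduces it; the constant names are illustrative, and treating each spam message as one disk seek is this sketch's reading of the slide's pessimistic estimate.

```python
import math

EMAILS_PER_DAY = 100_000_000_000       # 100 billion emails daily
SPAM_PERCENT = 65                      # 65% spam
SEEKS_PER_NODE_PER_SECOND = 400
SECONDS_PER_DAY = 86_400

# Assumption: one disk seek per spam message (the pessimistic case).
seeks_per_day = EMAILS_PER_DAY * SPAM_PERCENT // 100            # 65 billion
seeks_per_node_per_day = SEEKS_PER_NODE_PER_SECOND * SECONDS_PER_DAY

nodes_needed = math.ceil(seeks_per_day / seeks_per_node_per_day)
print(nodes_needed)  # 1881
```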

Performance Evaluation (Continued)

Questions?