Detection of Interactive Stepping Stones
Shobha [email protected]
Joint work with Avrim Blum & Dawn Song
Carnegie Mellon University
ICML Workshop 2006
June 29, 2006
Stepping Stone
Stepping-stone attack: attacker uses a chain of compromised machines to reach the victim
Difficult to find the attacker by looking only at the victim
  Victim only sees the last host in the chain
[Figure: Attacker A -> X1 -> ... -> Xk -> Victim V]
Why stepping-stones?
Stepping-stones are attractive to attackers
  Ease of compromising hosts on the Internet
  Difficulty of detection
    Don’t know when a host is compromised: only know when there is an attack
    Don’t know who compromised it
    Chaos and volume of Internet traffic
    Not always logged
True attacker almost untraceable: near-perfect way to achieve anonymity!
Large-scale stepping-stones: botnets…
Botnets: “For sale, stepping-stones”
Botnets: set of compromised hosts controlled by a single “command-center”
How this works:
  Individual hosts are compromised
  “Control privileges” are sold to other attackers, who use them to launch attacks
Nearly impossible to discover true attackers
Extremely prevalent on the Internet
  Logs at CMU dept: discovered Gaobot infection
    Across 6-7 months of traffic (everything we examined!)
    Across 100+ hosts (1/10th of network) at peak infection
Botnets (II)
[Figure labels: CMU, Stanford, Pittsburgh Verizon DSL, VICTIM, Attacker #1, Attacker #2]
General Stepping-Stone Detection
Extremely difficult:
  Indefinite delay between stepping-stone “legs”
  Traffic too voluminous/insufficiently logged for traceback
  Packets encrypted or padded between “legs”
  Stepping-stone “legs” additionally masked by adding superfluous traffic (“chaff”)
Restricting the Problem
Restrictions:
  Traffic monitoring done at routers/gateways
  Interactive stepping-stone streams
  Bounded delay between stepping-stones
[Figure: A -> X1 -> X2 -> V across the Internet; streams S1 and S2 observed at monitors M1 and M2; timeline labels T = 0, T]
Restricting the Problem (II)
Restrictions put together: observe 2 time-delayed streams at the monitor; are they a stepping-stone pair?
  If attacker uses no chaff
  If attacker uses chaff
[Figure: A -> X1 -> X2 -> V across the Internet; streams S1 and S2; timeline labels T = 0, T]
Prior work
Donoho, Flesia, Shankar, Paxson, Coit & Staniford, RAID 02
  Assumptions needed:
    Attack stream from Poisson or Pareto distribution
    Normal users perfectly uncorrelated
  No guarantees on monitoring time or false positives
Wang & Reeves, CCS 03
  Assumptions needed:
    Timing perturbation of packets iid [strong assumption]
    No chaff
  Scheme breaks without assumptions
Other related work: [SH95, YE00, ZP00, WRWY01, WRW02, W04]
Our work
Want to allow correlations among normal users
  Don’t flag just any correlated pair
  Time-correlated pair != stepping-stone pair
Use milder assumptions
  Model non-attack streams as sequences of Poisson processes
  No additional assumption on attacker
  Allow chaff
Present algorithms and analysis for these models
Inspiration from learning theory
Learning Theory Question:
How many examples do we need to see before we can identify hypotheses with guaranteed confidence?
Our Question:
How many packets do we need to see before we can identify normal/attack streams with guaranteed confidence?
Rest of talk: answer this question…
Outline
Problem definition
Without chaff
  Simple Poisson model
  Generalized Poisson model
With chaff
  Algorithms
  Hardness of detection results
Conclusions
Problem Definition (I)
Set-up: stepping-stone monitor tracks no. of packets in streams S1, S2 at a given time t: N1(t), N2(t)
Assumptions:
  Packets correspond 1-1 on stepping-stone streams (without chaff)
  Max tolerable delay bound $\Delta$ exists
  Max no. of packets attacker can send in time $\Delta$ exists: p
Our bounds will be in terms of p.
Problem Definition (II)
For stepping-stone streams S1 & S2:
  1. Every packet on S2 comes from S1:
     $N_1(t) \ge N_2(t)$
  2. Every packet on S1 appears on S2 within time $\Delta$ (the max delay bound):
     $N_1(t) \le N_2(t + \Delta)$
Assumptions on normal streams next…
Detect stepping-stone pairs with guarantees on:
  Monitoring time M: total packets observed on both streams before detection
  False-positive probability $\delta$
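The two conditions on N1 and N2 can be checked directly on packet timestamps. The following is a minimal sketch, not the monitor from the talk: the function names, the representation of each stream as a sorted list of arrival times, and the choice to test only at arrival times are assumptions made for illustration.

```python
from bisect import bisect_right


def count_up_to(timestamps, t):
    """N(t): number of packets with arrival time <= t (timestamps must be sorted)."""
    return bisect_right(timestamps, t)


def consistent_with_stepping_stone(s1, s2, max_delay):
    """Return True if the counts never violate either stepping-stone condition.

    s1, s2: sorted arrival times on streams S1 and S2 (hypothetical input format).
    max_delay: the delay bound. Both counts are step functions that only change
    at arrival times, so testing at those times is enough.
    """
    for t in sorted(set(s1) | set(s2)):
        n1 = count_up_to(s1, t)
        if n1 < count_up_to(s2, t):
            return False  # condition 1 fails: S2 has a packet S1 never sent
        if n1 > count_up_to(s2, t + max_delay):
            return False  # condition 2 fails: an S1 packet was not relayed in time
    return True
```

For example, with arrivals `[1.0, 2.0, 3.0]` on S1 and `[1.2, 2.3, 3.1]` on S2, the pair passes with a delay bound of 0.5 but fails with 0.2, because the second packet takes 0.3 to reappear.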
Simple Poisson Model
Assumptions:
  Normal stream: Poisson process with fixed rate (generalize this later)
  p is known (relax this later)
  No chaff (generalize this later)
Outline:
  Algorithm
  Analysis sketch
  Relax knowledge of p
Algorithm
Observe y packets on the union of streams S1 and S2
Compute the difference in no. of packets, d = N1 – N2
If d is not in [-p, p], return NORMAL
Repeat the above procedure over x iterations
Return ATTACK if d lies in [-p, p] throughout
Thm: with $x = \log(1/\delta)$, $y = 2(p+1)^2$:
  Monitoring time $M = xy = O(p^2 \log(1/\delta))$ packets
  False positives $< \delta$
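The steps of the algorithm can be sketched in a few lines. This is an illustrative reading of the slide, not the authors' code: the function name and the input format (an iterator yielding 1 or 2 for the stream each packet arrived on) are hypothetical, d is assumed to be reset at the start of each iteration, the logarithm is assumed to be base 2, and the per-iteration packet count 2(p+1)^2 is an assumed constant within the O(p^2) stated in the analysis.

```python
import math


def detect_attack(packet_sources, p, delta):
    """Classify a pair of streams as "ATTACK" or "NORMAL".

    packet_sources: iterable yielding 1 or 2, the stream each packet on the
        union S1 ∪ S2 arrived on, in arrival order (hypothetical format).
    p: max number of packets the attacker can send within the delay bound.
    delta: target false-positive probability.
    """
    x = max(1, math.ceil(math.log2(1.0 / delta)))  # iterations (base 2 assumed)
    y = 2 * (p + 1) ** 2                           # packets per iteration (assumed constant)
    packets = iter(packet_sources)
    for _ in range(x):
        d = 0  # difference N1 - N2 within this iteration (reset is an assumption)
        for _ in range(y):
            d += 1 if next(packets) == 1 else -1
        if abs(d) > p:
            return "NORMAL"  # d left [-p, p]: not a stepping-stone pair
    return "ATTACK"          # d stayed in [-p, p] in every iteration
```

Note that the function only needs to know which stream each packet came from; it never uses the Poisson rates.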
Analysis (I)
Overhead:
  Only per-stream packet counters running all the time!
  Compute sums & differences for pairs once in a while
  Algorithm needs NO knowledge of Poisson rates
Any stepping-stone pair sending M packets is reported
  For a stepping-stone pair, d stays within [-p, p]
  If |d| > p, some packet violates the max delay bound
Ensure that false-positive probability is less than $\delta$
  i.e. d leaves [-p, p] with probability more than $1 - \delta$
  When d leaves [-p, p], algorithm returns “normal”
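Analysis (II)
Streams S1 and S2: Poisson processes with rates $\lambda_1$, $\lambda_2$ (normalized so that $\lambda_1 + \lambda_2 = 1$)
On the union of streams, each packet has:
  $\lambda_1$ chance of coming from S1
  $\lambda_2$ chance of coming from S2
[Figure: packet arrivals on Stream 1 and Stream 2 over time]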
Analysis (III)
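Let Z be the difference in no. of packets on S1 and S2
Every time a packet appears on $S_1 \cup S_2$:
  $Z = Z + 1$ with probability $\lambda_1$
  $Z = Z - 1$ with probability $\lambda_2$
Thus, Z is equivalent to a 1-d random walk
Need Z to exit [-p, p] after some steps
[Figure: packet arrivals on Stream 1 and Stream 2 with the running value of Z, drawn as a random walk on the number line from -2 to 2 with step probabilities $\lambda_1$ and $\lambda_2$]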
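The random walk traced by Z can be simulated to get a feel for how quickly a normal pair drifts out of [-p, p]. This is a sketch under the slide's model: the function name and parameters are hypothetical, and the rates are normalized inside the function rather than assumed to sum to 1.

```python
import random


def difference_walk(rate1, rate2, n_packets, rng):
    """Return the value of Z after each of n_packets packets on S1 ∪ S2.

    Each packet on the union of two independent Poisson streams comes from
    S1 with probability rate1 / (rate1 + rate2), independently of the others.
    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    prob_s1 = rate1 / (rate1 + rate2)
    z = 0
    walk = []
    for _ in range(n_packets):
        z += 1 if rng.random() < prob_s1 else -1
        walk.append(z)
    return walk


def first_exit_step(walk, p):
    """Index (1-based) of the first step where Z is outside [-p, p], or None."""
    for step, z in enumerate(walk, start=1):
        if abs(z) > p:
            return step
    return None
```

Calling `first_exit_step(difference_walk(0.5, 0.5, 1000, random.Random(0)), 5)` gives the exit step for one sampled walk; averaging over many seeds gives an empirical estimate of the expected exit time.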
Analysis (IV)
Fact: a 1-d random walk exits a bounded region of length t in expected $O(t^2)$ time!
Therefore,
  When $n = O(p^2)$,
    Pr[Z will stay in bounded region] < 1/2
  Repeat for $m = \log(1/\delta)$ iterations
    Pr[Z will stay in bounded region] < $\delta$
  (n and m play the roles of y and x in the algorithm)
When Z exits the bounded region, a normal pair does not get falsely accused. Done!
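The halving-and-repeating argument can be written out as a short derivation. This is a sketch under two assumptions that are not stated on the slide: the expected exit time T of Z from [-p, p] is taken to be at most (p+1)^2 (the exact value for a symmetric walk started at 0), and the m iterations are treated as independent restarts of the walk.

```latex
\begin{align*}
\Pr[T > n] &\le \frac{\mathbb{E}[T]}{n} \le \frac{(p+1)^2}{2(p+1)^2} = \frac{1}{2}
  && \text{(Markov's inequality with } n = 2(p+1)^2\text{)} \\
\Pr[Z \text{ stays in } [-p, p] \text{ in all } m \text{ iterations}]
  &\le \left(\tfrac{1}{2}\right)^{m} \le \delta
  && \text{when } m \ge \log_2(1/\delta)
\end{align*}
```

The total number of packets observed is then $nm = 2(p+1)^2 \log_2(1/\delta) = O(p^2 \log(1/\delta))$, matching the monitoring time in the theorem.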
What if p is unknown?
What if we do not know p?
Use “guess and double” strategy.
  Set $p_j = 2^j$.
  Run algorithm over sequence of $p_j$: $p_1, p_2, \ldots$
  When a pair is “cleared” for $p_j$, examine it with respect to $p_{j+1}$.
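The guess-and-double loop can be sketched as a thin wrapper around any single-guess test. The names here are hypothetical: `run_test` stands in for one run of the detection algorithm with a given guess, and `p_max` is an assumed upper bound on p that stops the loop. How the false-positive budget is divided across guesses is not modeled.

```python
def guess_and_double(run_test, p_max):
    """Try guesses p_j = 2^j for j = 1, 2, ... until one flags the pair.

    run_test: callable taking a guess p_j and returning "ATTACK" or "NORMAL"
        (hypothetical stand-in for one run of the detection algorithm).
    p_max: assumed upper bound on p; guesses above it are not tried.
    Returns ("ATTACK", p_j) for the first guess that flags the pair,
    or ("NORMAL", None) if the pair is cleared for every guess.
    """
    j = 1
    while 2 ** j <= p_max:
        p_j = 2 ** j
        if run_test(p_j) == "ATTACK":
            return "ATTACK", p_j
        j += 1  # pair cleared for p_j: examine it with respect to p_{j+1}
    return "NORMAL", None
```

A pair whose true bound is 5 is cleared for guesses 2 and 4 and flagged at 8, so the number of guesses grows only logarithmically in p.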
What if p is unknown?
For a stepping-stone pair, increases monitoring time by O(log log p).
Guarantee depends only on the true value of p!
In practice, set an upper bound for p
  Normal streams monitored until upper bound reached
  As j increases, test differences exponentially less often
Fundamental problem: cannot distinguish between a normal pair and an attack pair with a longer delay bound
Summary: Simple Poisson
Normal streams: Poisson process with single fixed rate.
Algorithm with guarantees on false positives and monitoring time
Algorithm needs no knowledge of Poisson rates
Analysis extended
  When p is unknown
  When false positive probability is distributed over all pairs of streams: in paper
Generalized Poisson model
Model normal process as SEQUENCE of Poisson processes: varying rates for varying time periods
  i.e. stream given by: $(\lambda_1, t_1), (\lambda_2, t_2), \ldots$
General model: coarsely approximate almost any usage pattern, for example:
  Coarsely simulate Pareto distributions – good model of typing patterns
  Correlated users: same sequence of Poisson rates & time intervals
Analysis Sketch
Formally, a stream S is given by: $(\lambda_1, t_1), (\lambda_2, t_2), \ldots$
Key observation:
  At time T, packet distribution equivalent to Poisson process with single fixed rate $\sum_j (\lambda_j \cdot t_j)/T$ (weighted mean)
More details in paper.
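The weighted-mean rate in the key observation is a one-line computation. A minimal sketch, with a hypothetical function name and with the stream given as a list of (rate, duration) pairs whose durations are assumed to sum to T:

```python
def effective_rate(segments):
    """Weighted mean rate of a stream given as (rate, duration) segments.

    segments: list of (lambda_j, t_j) pairs; the total time T is taken to be
    the sum of the durations (an assumption of this sketch).
    """
    total_time = sum(duration for _, duration in segments)
    expected_packets = sum(rate * duration for rate, duration in segments)
    return expected_packets / total_time
```

For example, a stream that runs at rate 2.0 for 10 time units and then at rate 0.5 for 30 time units has `effective_rate([(2.0, 10.0), (0.5, 30.0)])` equal to 35/40 = 0.875.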
Summary: General Poisson
Normal streams modelled as sequences of Poisson processes: $(\lambda_1, t_1), (\lambda_2, t_2), \ldots$
  Very general model
Algorithm with guarantees on monitoring time and false positive rate
  Once again, algorithm needs no knowledge of Poisson rates
Results in this model extended similarly:
  When p is unknown
  When false positive probability is distributed over all pairs of streams
Chaff
Chaff: dummy packets inserted in traffic streams to avoid detection
[Figure: Attacker -> Stepping Stone -> Victim]
Algorithms (as presented) broken by single packet of chaff
Next, modify algorithms to handle limited chaff…
Chaff: Algorithms
Fix chaff rate, but chaff arbitrarily distributed
Simple Poisson model
  Algorithm:
    Let y be number of packets needed before we exit bounded region in random walk.
    Allow chaff rate of p/4y, monitor for difference to leave [-2p, 2p]
    Normal streams still get a difference outside [-2p, 2p] (wait longer)
  Can tweak algorithm to handle slightly higher chaff rate, but that’s all. Hardness results next…
Extends similarly to general Poisson model.
Hardness of Detection
No algorithm based on timing delays alone can detect stepping-stones with smart use of chaff
Can give bounds on chaff needed so attacker can
  pre-generate two independent processes
  send packets to mimic independent processes exactly
  Details & strategies in paper
If attacker can actively send such chaff, detection requires use of other information
Summary
Algorithms to detect stepping stones:
  Guarantees on monitoring time and false positives
  Simple and generalized Poisson models
  With and without (arbitrarily distributed) chaff
  When p is known/unknown
Compared to previous work:
  Milder assumptions, allow for substantial correlation among normal users
  No additional assumptions on attacker (besides delay bound)
With sufficient chaff, attacker can mask stepping stones, so that no algorithm that uses inter-packet delays can detect them.