Blacklisting and Blocking Sources of Malicious Traffic

Athina Markopoulou
University of California, Irvine

Joint work with Fabio Soldo, Anh Le @ UC Irvine and Katerina Argyraki @ EPFL

Outline

Motivation
– Malicious Internet Traffic: Attack and Defense

Two Defense Mechanisms
– Proactive: Predictive Blacklisting
– Reactive: Source-Based Filtering

Conclusion

Malicious Traffic on the Internet

Compromising systems
– scanning, worms, website attacks
– phishing, social engineering attacks, ...

Launching attacks
– spam
– click-fraud
– Denial-of-Service attacks, ...

Botnets
– large groups of compromised hosts, remotely controlled

The solution requires many components

Monitoring and detection of malicious activity
– in the network and/or at hosts
– signature-based, behavioral analysis

Mitigation
– at the hosts: remove malicious code
– in the network: block, rate-limit, scrub malicious traffic

Internet architecture

Defense at the edge of the network

[Figure: Networks 1-4 connected through routers; each network edge runs logging, an IDS, and a firewall.]

Our focus is on (1) blacklisting and (2) blocking malicious traffic

Dshield Dataset

6 months of IDS+firewall logs from Dshield.org (May-Oct 2008)
– ~600 contributing networks, 60M+ source IPs, 400M+ logs

[Figure: contributing networks submit their logs to Dshield.org.]

Log fields: Time | Victim ID (contributor) | Src IP | Dst IP | Src Port | Dst Port | Protocol | Flags

Pros: huge amount of data, diverse sample, used by many researchers
Cons: no detailed information on alerts, may include errors
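
The log fields listed above map naturally onto a small record type. The following is a minimal sketch only: the tab-separated layout, the field types, and the names `LogRecord` and `parse_record` are assumptions for illustration, not Dshield's actual format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One IDS/firewall log entry with the fields named on the slide."""
    time: int        # timestamp (assumed: seconds since the epoch)
    victim_id: str   # contributing network that reported the entry
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    flags: str


def parse_record(line: str) -> LogRecord:
    """Parse one tab-separated line (hypothetical layout)."""
    t, victim, src, dst, sport, dport, proto, flags = line.rstrip("\n").split("\t")
    return LogRecord(int(t), victim, src, dst, int(sport), int(dport), proto, flags)


example = parse_record("1210000000\tnet42\t198.51.100.7\t192.0.2.1\t4444\t22\tTCP\tS")
```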

Outline

Background
– Malicious Internet Traffic: Attack and Defense

Two Defense Mechanisms
– Proactive: Predictive Blacklisting
– Reactive: Source-Based Filtering

Conclusion

Predictive Blacklisting

Problem definition:
– Given past logs of malicious activity collected at various locations
– Predict sources likely to send malicious traffic to each victim network in the future.

Blacklist:
– list of "worst" (e.g. top-100) attack sources

Prediction vs. Detection

Data analysis
Superposition of several behaviors

[Figure: number of alerts by source ("attacker") IP and day.]

A multi-level prediction model

Different predictors capture different patterns in the dataset:
– Model temporal dynamics
– Model spatial correlation between victims/attackers

Combine different predictors

Formulate as a Recommendation Systems problem
– in particular, collaborative filtering

Recommender systems: example

Netflix: you rate movies and you get suggestions

Formulating Predictive Blacklisting as a Recommendation System (CF)

Recommendation System    Predictive Blacklisting
Users                    Attackers
Items                    Victims
User rating              Attack volume

[Figure: rating matrix R over users (attackers) and items (victims); known entries are user ratings (attack volumes), and "?" marks unknown entries.]

Goal: predict rating matrix: $r_{a,v}(t)$
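
To make the mapping concrete, the sketch below builds a sparse rating matrix from log tuples. It assumes that the "attack volume" of a pair is the number of log entries per day; the `(day, victim, attacker)` tuple layout and the name `build_ratings` are hypothetical.

```python
from collections import defaultdict


def build_ratings(logs):
    """Return r[(attacker, victim)][day] = number of log entries for that pair on that day."""
    r = defaultdict(lambda: defaultdict(int))
    for day, victim, attacker in logs:
        r[(attacker, victim)][day] += 1
    return r


logs = [
    (1, "v1", "a1"), (1, "v1", "a1"), (1, "v2", "a1"),
    (2, "v1", "a2"), (2, "v2", "a1"),
]
ratings = build_ratings(logs)
```

Pairs that never appear in the logs stay absent from the dictionary, which plays the role of the "?" entries in the matrix.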

Predictor I: (attacker, victim) pair
Temporal dynamics

$r^{TS}_{a,v}(t)$

Data analysis: attacks from the same source within short time

Predictor I: (a, v) time series

$r^{TS}_{a,v}(t)$

Data analysis: repeated attacks within short time periods

Prediction:
– Use an EWMA model to capture this temporal trend
– Accounts for the short memory of attack sources
– Computationally efficient
– Includes as special case t=1

[Equation: predicted activity as a function of past activity at times t' ≤ t.]
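
The slide's exact formula did not survive, so the following sketch uses the standard exponentially weighted moving average as an assumption: past activity at time t' is weighted by alpha * (1 - alpha)^(t - t'). The smoothing parameter `alpha` and the function name are hypothetical.

```python
def ewma_forecast(history, alpha=0.5):
    """Forecast the next step from history[0..t-1], weighting recent steps most.

    The entry for time t' (1-based) gets weight alpha * (1 - alpha) ** (t - t').
    """
    t = len(history)
    return sum(alpha * (1 - alpha) ** (t - 1 - i) * x for i, x in enumerate(history))


recent = ewma_forecast([0, 0, 0, 1])  # attack seen on the last day
old = ewma_forecast([1, 0, 0, 0])     # attack seen three days earlier
```

A recent attack contributes far more than an old one, which matches the "short memory" of attack sources noted above.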

Predictor II: similar victims
Spatial correlation

Data analysis: victims share common attackers.
– [Katti et al., IMC 2005], [Zhang et al., USENIX Security 2008]

Our approach:

[Figure: victims linked by their common attackers.]

Predictor II: similar victims
Defining similarity

• Similarity of victims u,v captures:
– the number of common attackers
– and when they are attacked

Our approach: common attackers

      a1  a2  a3  a4
v1     1   1   0   0
v2     1   1   0   0
v3     1   1   1   0
v4     0   0   1   1
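
One way to capture both ingredients (how many attackers two victims share, and how close in time the attacks are) is sketched below. The exponential time discount and the lack of normalization are assumptions for illustration; the slide does not give the exact similarity formula.

```python
import math


def similarity(attacks_u, attacks_v):
    """attacks_x maps day -> set of attacker IPs seen by victim x on that day.

    Each pair of days contributes its number of common attackers,
    discounted by exp(-|t1 - t2|) so that simultaneous attacks count most.
    """
    total = 0.0
    for t1, set_u in attacks_u.items():
        for t2, set_v in attacks_v.items():
            total += math.exp(-abs(t1 - t2)) * len(set_u & set_v)
    return total


v1 = {1: {"a1", "a2"}}
v2 = {1: {"a1", "a2"}}  # same attackers, same day
v3 = {3: {"a1", "a2"}}  # same attackers, two days later
v4 = {1: {"a3", "a4"}}  # no common attackers
```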

Predictor II: similar victims
k-nearest neighbors (kNN)

$r^{KNN}_{a,v}(t)$

Traditional kNN: "trust" your peers
– Identify the k most similar victims ("neighbors") and predict your rating based on theirs

New challenges due to time-varying ratings

Our approach:

[Equation: predicted activity = sum over the neighborhood of v of (similarity between time-varying vectors) × (time series forecast given past logs).]
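
The neighborhood sum can be sketched as a similarity-weighted average of the neighbors' forecasts. The normalization by total similarity, the default `k`, and the dictionary layouts are assumptions; the similarities and per-victim forecasts are taken as given inputs.

```python
def knn_predict(victim, attacker, sims, forecasts, k=2):
    """Predict attacker activity at `victim` from its k most similar victims.

    sims[victim][u]          -> similarity between `victim` and another victim u
    forecasts[(attacker, u)] -> time-series forecast of attacker's activity at u
    """
    neighbours = sorted(sims[victim].items(), key=lambda kv: kv[1], reverse=True)[:k]
    norm = sum(s for _, s in neighbours)
    if norm == 0:
        return 0.0
    return sum(s * forecasts.get((attacker, u), 0.0) for u, s in neighbours) / norm


sims = {"v1": {"v2": 2.0, "v3": 1.0, "v4": 0.0}}
forecasts = {("a1", "v2"): 0.5, ("a1", "v3"): 0.2, ("a1", "v4"): 1.0}
prediction = knn_predict("v1", "a1", sims, forecasts, k=2)
```

Here v4 is ignored because it is not among the two nearest neighbors, even though the attacker is very active there.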

Predictor III: Attackers-Victims
Co-clustering

Data analysis:
– groups of attackers consistently target the same group of victims
– this behavior often persists over time

We used the Cross-Association (CA) method to automatically identify dense clusters of victims-attackers.

Predictor III: Attackers-Victims
Prediction

$r^{EWMA-CA}_{a,v}(t)$

Intuition:
– pairs (a,v) in dense clusters are more likely to occur
– use the density of the cluster as the predictor

EWMA-CA: further weight by persistence over time
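
The density idea can be illustrated as follows, assuming the attacker and victim clusters have already been found (the Cross-Association step itself is not implemented here). The function name and inputs are hypothetical.

```python
def cluster_density(pairs, attacker_cluster, victim_cluster):
    """Fraction of (attacker, victim) cells inside the cluster that were observed."""
    cells = len(attacker_cluster) * len(victim_cluster)
    if cells == 0:
        return 0.0
    hits = sum((a, v) in pairs for a in attacker_cluster for v in victim_cluster)
    return hits / cells


observed = {("a1", "v1"), ("a1", "v2"), ("a2", "v1")}
density = cluster_density(observed, {"a1", "a2"}, {"v1", "v2"})
```

In this toy cluster the unobserved pair (a2, v2) would receive a score of 0.75, reflecting that its cluster is dense.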

A multi-level prediction model
Summary

Different predictors capture different patterns:
– Temporal trends
  • EWMA TS of (attacker, victim)
– Neighborhood models:
  • KNN: similarity of victims
  • EWMA-CA: interaction of attackers-victims

Combine different predictors

Combining different predictors

Weighted Average
– with weights proportional to the accuracy of each predictor on a pair (a,v).
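
A minimal sketch of such a weighted average is shown below. How the per-pair accuracy of each predictor is measured is not stated on the slide, so the accuracy values here are placeholder inputs, and the fallback to a plain mean is an assumption.

```python
def combine(predictions, accuracies):
    """Weighted average of predictor outputs, weights proportional to accuracy."""
    total = sum(accuracies[name] for name in predictions)
    if total == 0:
        # No predictor has any track record: fall back to a plain mean.
        return sum(predictions.values()) / len(predictions)
    return sum(accuracies[name] * predictions[name] for name in predictions) / total


predictions = {"TS": 0.8, "KNN": 0.4, "EWMA-CA": 0.2}
accuracies = {"TS": 0.5, "KNN": 0.3, "EWMA-CA": 0.2}
combined = combine(predictions, accuracies)
```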

Performance Analysis
Baseline Blacklisting Techniques

• Local Worst Offender List (LWOL)
– Most prolific local attackers
– Reactive but not proactive

• Global Worst Offender List (GWOL)
– Most prolific global attackers
– Might contain irrelevant attackers
– Non-prolific attackers are elusive to GWOL

• Collaborative Blacklisting (HPB)
– [J. Zhang, P. Porras, J. Ullrich, "Highly Predictive Blacklisting", USENIX Security 2008]
– Also implemented and offered as a service (HPB) by Dshield.org
– Methodology: use link analysis on the victims' similarity graph to predict future attacks
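
The two worst-offender baselines can be sketched with simple counting. "Most prolific" is interpreted here as "most log entries", which is an assumption; the `(victim, attacker)` tuple layout and function names are hypothetical.

```python
from collections import Counter


def lwol(logs, victim, n):
    """Top-n attackers by number of log entries at one victim."""
    counts = Counter(a for v, a in logs if v == victim)
    return [a for a, _ in counts.most_common(n)]


def gwol(logs, n):
    """Top-n attackers by number of log entries across all victims."""
    counts = Counter(a for _, a in logs)
    return [a for a, _ in counts.most_common(n)]


logs = [
    ("v1", "a1"), ("v1", "a1"), ("v1", "a2"),
    ("v2", "a3"), ("v2", "a3"), ("v2", "a3"), ("v2", "a3"), ("v2", "a1"),
]
local_v1 = lwol(logs, "v1", 1)
global_top = gwol(logs, 1)
```

In this toy data the global list is topped by a3, which never contacts v1: an example of the "irrelevant attackers" that GWOL may contain.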

Performance Analysis
Total hit count

60 days of Dshield logs, 5 days training, 1 day testing, BL length = 1000

The combined method
– significantly improves the hit count (up to 70%, 57% on avg)
– exhibits less variation over time

[Figure: total hit count over time for the combined method, HPB, and GWOL.]

Predicting Attacks
What is the best we can do?

[Figure: victim-by-attacker activity matrices for a training day t1 and a test day t2; in the example, LocalUB(vi)=3 and GlobalUB=5.]

Local Upper Bound: # IPs in training & test window of a particular contributor

Global Upper Bound: # IPs in training window of any contributor
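
The bounds can be sketched with set operations. This is one reading of the definitions: the local bound counts sources seen by the victim in both windows, and the global bound counts sources in the victim's test window that any contributor saw during training. The hit count is assumed to be the number of blacklisted sources that show up in the test window. All names are hypothetical.

```python
def hit_count(blacklist, test_sources):
    """Number of blacklisted sources that actually appear in the test window."""
    return len(set(blacklist) & set(test_sources))


def local_upper_bound(train, test, victim):
    """Sources seen by this victim in both its training and test windows."""
    return len(train[victim] & test[victim])


def global_upper_bound(train, test, victim):
    """Sources in this victim's test window seen by any contributor in training."""
    seen_anywhere = set().union(*train.values())
    return len(seen_anywhere & test[victim])


train = {"v1": {"a1", "a2"}, "v2": {"a3"}}
test = {"v1": {"a1", "a3", "a4"}, "v2": {"a3"}}
```

For v1, attacker a3 is predictable only by using v2's training logs, which is why the global bound exceeds the local one.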

Predicting Attacks
Room for improvement

Collaboration helps!

Large gap from prior methods

[Figure label: Our method (|BL|=1000)]

Performance Analysis
Robustness to random errors

Robustness achieved by diverse methods

E.g., an attacker may send traffic to a single victim (detected by the temporal predictor) or to several victims (detected by the spatial predictors), or may limit attack activity

Predictive Blacklisting as a Recommendation System
Summary

Contributions
– Combined predictors that capture different patterns in the data
– Significant improvement with simple techniques
  • still room for further improvement
– New formulation as a recommender system (collaborative filtering) problem
  • paves the way to powerful techniques
  • e.g., capture global structure (latent factors), joint spatio-temporal models

References
– F. Soldo, A. Le, A. Markopoulou, "Predictive Blacklisting as an Implicit Recommendation System", IEEE INFOCOM 2010 and on arXiv.org
– In the news: MIT Technology Review, Slashdot, ACM TechNews

How to use a list of malicious sources?

• A policy decision:
– E.g. scrub, give lower priority, block, monitor, do nothing, ...

• One option is to block (filter) malicious sources
– when: during flooding attacks by million-node botnets
– where: at firewalls or at the routers

Outline

Background
– Malicious Internet Traffic: Attack and Defense

Two Defense Mechanisms
– Proactive: Predictive Blacklisting
– Reactive: Optimal Source-Based Filtering

Conclusion

Filtering at the routers

• Access Control Lists (ACLs)
– Match a packet header against rules, e.g. source and destination IP addresses
– Source-based filter: ACL that denies access to a source IP/prefix

• Filters implemented in TCAM
– Can keep up with high speeds
– Limited resource

• There are fewer filters than attack sources
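
A source-based filter reduces to a membership test of the packet's source address against a list of denied prefixes. The sketch below uses Python's standard `ipaddress` module; the linear scan is for illustration only and does not model the parallel lookup a TCAM performs. The name `is_denied` and the example prefixes are hypothetical.

```python
import ipaddress


def is_denied(src_ip, filters):
    """True if src_ip falls inside any denied source prefix."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in filters)


filters = [
    ipaddress.ip_network("10.0.0.2/31"),  # one filter covering 10.0.0.2 and 10.0.0.3
    ipaddress.ip_network("10.0.0.7/32"),  # one filter for a single address
]
```

Two filters cover three sources here, which is the kind of saving that matters when the filter budget is smaller than the number of attack sources.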

Filter Selection at a Single Router
Tradeoff: number of filters vs. collateral damage

• Filter an attack source A.B.C.D
• Filter a prefix A.B.C.*

[Figure: attackers and legitimate users send traffic through the ISP edge router and over an access link of capacity C to the victim V.]

Optimal Source-Based Filtering

Design a family of filter selection algorithms that:
• take as input:
– a blacklist of malicious (bad) sources
– a whitelist of legitimate (good) sources
– a constraint on the number of filters Fmax
– a constraint on the access bandwidth C
– the operator's policy
• optimally select which source IP prefixes to filter
– so as to optimize the operator's objective
– subject to the constraints

[Figure: the IP address space from 0 to 2^32-1, showing a single address A.B.C.D and a prefix A.B.C.*.]

So far, heuristically done (through ACLs or rate limiters)

Optimal Source-Based Filtering
A General Framework

– [l,r]: range in the IP space
– p/l: prefix p of length l
– Fmax: number of filters (<<N)
– $x_{[l,r]}$: whether we block range [l,r] or not
– $W_i$: weight assigned to source IP address i
– cost of blocking a range [l,r]

Optimal Source-Based Filtering
Expressing Operator's Policy

• Assignment of weights $W_i$ is the operator's knob:
– indicates volume of traffic sent, or importance assigned by the operator
– $W_i$>0 (good source i), $W_i$<0 (bad source i), $W_i$=0 (indifferent)

• Objective function
[Equation: the cost of a range [l,r] is expressed through the cost of good sources in range [l,r] and the cost of bad sources in range [l,r].]
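
The two cost terms can be illustrated by splitting the weights inside a range by sign. This sketch only computes the two terms; how the objective combines them is not recoverable from the slide. Addresses are plain integers and the function name is hypothetical.

```python
def range_costs(weights, l, r):
    """Return (good, bad): total positive weight and total |negative weight| in [l, r]."""
    good = sum(w for ip, w in weights.items() if l <= ip <= r and w > 0)
    bad = sum(-w for ip, w in weights.items() if l <= ip <= r and w < 0)
    return good, bad


# Positive weight = good source, negative weight = bad source.
weights = {2: -5, 3: -1, 4: 3, 7: -2, 9: 4}
good_cost, bad_cost = range_costs(weights, 0, 7)
```

Blocking the range [0, 7] would drop 8 units of bad weight at the price of 3 units of good weight.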

Filter Selection Algorithms
Problem Overview

• RANGE-based: filter IP or range [l,r]
[Soldo, El Defrawy, Markopoulou, Van De Merwe, Krishnamurthy: ITA'09]
– FILTER-ALL-RANGE
– FILTER-SOME-RANGE
– FILTER-ALL-DYNAMIC-RANGE

• PREFIX-based: filter IP source or prefix
[Soldo, Markopoulou, Argyraki: INFOCOM'09, arXiv.org]
– FILTER-ALL: block all malicious sources
– FILTER-SOME: block some malicious sources
– FILTER-ALL-DYNAMIC: BL varies over time
– FLOODING: bandwidth constraint at access router
– DISTRIBUTED-FLOODING: filters at multiple routers

Filter Selection Algorithms
Algorithms Overview

• RANGE-based: filter IP or range [l,r]
[Soldo, El Defrawy, Markopoulou, Van De Merwe, Krishnamurthy: ITA'09]
– FILTER-ALL-RANGE
– FILTER-SOME-RANGE
– FILTER-ALL-DYNAMIC-RANGE

• PREFIX-based: filter IP source or prefix
[Soldo, Markopoulou, Argyraki: INFOCOM'09, arXiv.org]
– FILTER-ALL: O(N)
– FILTER-SOME: O(N)
– FILTER-ALL-DYNAMIC: O(N)
– FLOODING: NP-hard, pseudo-polynomial alg. O(C^2 N) + heuristic
– DISTRIBUTED-FLOODING: distributed solution

All following a dynamic programming approach

Longest Common Prefix Tree of a BL

• LCP-Tree(BL): binary tree, leaves are addresses in BL, intermediate nodes are their longest common prefixes
• It can be found from the full binary tree of IP prefixes
• E.g. for BL={10.0.0.2, 10.0.0.3, 10.0.0.7}, the LCP-Tree(BL) is:

10.0.0.0/29        (3 bad, 5 good addresses)
    10.0.0.2/31    (0 good, 2 bad addresses)
        10.0.0.2/32
        10.0.0.3/32
    10.0.0.7/32

• Finding a set of filters:
– no need to look for all possible sets of prefixes
– sufficient to look only for prunings of the LCP tree
– lends itself to a dynamic programming approach
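
The construction can be sketched by sorting the blacklist and splitting recursively at the first bit where the smallest and largest address differ. This is one straightforward way to build the tree under the definition above, not necessarily the authors' procedure; the nested-tuple representation and the name `lcp_tree` are assumptions.

```python
import ipaddress


def lcp_tree(addresses):
    """Return nested tuples (prefix, left, right); leaves are (prefix, None, None)."""
    ints = sorted({int(ipaddress.IPv4Address(a)) for a in addresses})
    if not ints:
        return None

    def build(lo, hi):  # covers ints[lo:hi], which is never empty
        if hi - lo == 1:
            return (f"{ipaddress.IPv4Address(ints[lo])}/32", None, None)
        # In a sorted list, the common prefix of the first and last address
        # is the common prefix of the whole slice.
        diff = ints[lo] ^ ints[hi - 1]
        length = 32 - diff.bit_length()
        base = ints[lo] >> (32 - length) << (32 - length)
        # Split on the first bit after the common prefix.
        bit = 1 << (31 - length)
        mid = next(i for i in range(lo, hi) if ints[i] & bit)
        return (f"{ipaddress.IPv4Address(base)}/{length}", build(lo, mid), build(mid, hi))

    return build(0, len(ints))


tree = lcp_tree(["10.0.0.2", "10.0.0.3", "10.0.0.7"])
```

For the slide's example blacklist, the root is 10.0.0.0/29 with children 10.0.0.2/31 and 10.0.0.7/32.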

Filter-All-Prefix
Problem Statement

• Given: a blacklist BL, weight $w_i$ (for each good IP i), Fmax filters
• choose: prefixes p/l ($x_{p/l}$)
• so as to: filter all bad addresses and minimize collateral damage

Filter-All-Prefix
Dynamic Programming Algorithm

[Equation: the cost of the optimal allocation of F filters within a prefix p is computed recursively from its left and right subtrees, sL and sR, with F-n ≥ 1 filters within the left subtree and n ≥ 1 filters within the right subtree.]

n = 1, …, F-1: means that we want to block all malicious sources (leaves)
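
The recursion can be sketched as follows, assuming that at each LCP-tree node the choice is between blocking the whole prefix with one filter and splitting the budget so that each subtree gets at least one filter. Addresses are plain integers in a `bits`-wide space, good weights are assumed non-negative, and all names are hypothetical; this is a sketch of the idea, not the authors' implementation.

```python
from functools import lru_cache


def filter_all(bad, good_weight, fmax, bits=32):
    """Min collateral damage when every address in `bad` is blocked with at most fmax prefixes."""
    bad = sorted(set(bad))

    def damage(base, length):
        # Total weight of good addresses inside the prefix base/length.
        size = 1 << (bits - length)
        return sum(w for ip, w in good_weight.items() if base <= ip < base + size)

    @lru_cache(maxsize=None)
    def z(lo, hi, f):
        # Optimal cost of covering bad[lo:hi] with at most f >= 1 filters.
        if hi - lo == 1:
            return 0  # a single-address filter blocks no good source
        diff = bad[lo] ^ bad[hi - 1]
        length = bits - diff.bit_length()
        base = bad[lo] >> (bits - length) << (bits - length)
        best = damage(base, length)  # option 1: one filter on the common prefix
        bit = 1 << (bits - 1 - length)
        mid = next(i for i in range(lo, hi) if bad[i] & bit)
        for n in range(1, f):  # option 2: f - n filters left, n filters right
            best = min(best, z(lo, mid, f - n) + z(mid, hi, n))
        return best

    return z(0, len(bad), fmax)


# Toy 3-bit address space: bad sources 1, 2, 7; the rest are good.
good = {0: 1, 3: 2, 4: 1, 5: 1, 6: 1}
costs = [filter_all([1, 2, 7], good, f, bits=3) for f in (1, 2, 3)]
```

With one filter the whole space must be blocked; with two, only the half containing addresses 1 and 2; with three, each bad source gets its own filter and no good source is hit.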

Filter-All-Prefix
DP Algorithm: Example

[Figure: LCP-tree example with Fmax = 4, N = 10.]

Filter-Some-Prefix

[Figure: LCP-tree example with Fmax = 4, N = 10.]

Filter-All-Prefix-Dynamic
Time-varying case

[Figure: LCP-tree example with Fmax = 4, N = 10.]

Need to be (re)computed: O(Fmax log(N))

FLOODING
Problem Statement

• Given: a blacklist BL, a whitelist WL, a weight of address = traffic volume generated, a constraint on the link capacity C, and Fmax filters
• choose: source IP prefixes, $x_{p/l}$
• so as to: minimize the collateral damage and fit the total traffic within the link capacity C

FLOODING
DP Algorithm

• FLOODING is NP-hard
– reduction from knapsack with cardinality constraint (1.5K)

• An optimal pseudo-polynomial dynamic programming algorithm solves the problem in O((C Fmax)^2 N)
– similar to the previous DP, but solves a 2-dimensional KP
– the LCP-Tree includes both good and bad addresses
– DP extended to take into account the capacity constraint

• A heuristic, by adjusting the granularity (ΔC>1) of C

Distributed Flooding
Filters at several routers

• Deploy filters at several routers
– increase total filter budget

• Each router (u) has its own:
– view of good/bad traffic
– capacity in incoming link
– filter budget

• Filtering at several routers:
– not only which prefix to block
– but also on which router

• Solution:
– can be solved in a distributed way
– outperforms independent decisions

[Figure: attackers and legitimate users reach the victim through several routers.]

Evaluation using Dshield data
FLOODING vs. rate limiting

• Attack sources, from the point of view of a single victim in Dshield
• Good sources: [Kohler et al. TON'06, Barford et al. PAM'06]
• Before attack: good traffic was C/10 < C
• During attack: bad traffic is 10C

[Figure: plot; surviving axis labels: C, D/N.]

Optimal filter selection preserves the good traffic and drops the bad.

Intuition why optimization helps
Compared to non-optimized filtering

• Malicious sources are clustered in the IP address space
• Malicious sources are not co-located with legitimate sources
• Filtering can block IP prefixes with malicious sources, without penalizing (many) legitimate sources.

Evaluation using Dshield data (2)
FILTER-ALL-PREFIX vs. generic clustering algorithms

• Malicious addresses:
– attacking 2 specific victim networks (most and least clustered) in Dshield dataset
• Good addresses generated:
– using a multifractal model [Kohler et al. TON'06, Barford et al. PAM'06]

Optimal filter selection outperforms generic clustering

Evaluation using Dshield data (3)
DISTRIBUTED-FLOODING: the value of coordination

[Figure: plot; surviving axis labels: D/N, C.]

Coordination among routers helps

Optimal Source-Based Filtering
Summary

• Framework for optimal filter selection
– defined various filtering problems
– designed efficient algorithms to solve them

• Leads to significant improvements on real datasets
– compared to non-optimized filter selection, to generic clustering, or to uncoordinated routers
– because of clustering of malicious sources

Outline

Background
– Malicious Internet Traffic: Attack and Defenses

Two Defense Mechanisms
– Proactive: Blacklisting as a Recommendation System
– Reactive: Filtering as an Optimization Problem

Conclusion
– Parts of a larger system that collects and analyzes data from multiple sensors and takes appropriate action

Thank you!

[email protected]://newport.eecs/uci.edu/~athina
