Worm Origin Identification Using Random Moonwalks. Yinglian Xie, V. Sekar, D. A. Maltz, M. K. Reiter, Hui Zhang. 2005 IEEE Symposium on Security and Privacy. Presented by: Anup Goyal, Edward Merchant.


Worm Origin Identification Using Random Moonwalks

Yinglian Xie, V. Sekar, D. A. Maltz, M. K. Reiter, Hui Zhang

2005 IEEE Symposium on Security and Privacy

Presented by:

Anup Goyal

Edward Merchant


Outline

Motivation/Introduction
Problem Formulation
The Random Moonwalk Algorithm
Evaluation Methodology
Analytical Model
Real Trace Study
Simulation Study
Deployment and Future Work


Motivation

Little automated support exists for identifying the location from which an attack is launched.

Knowledge of the origin supports law enforcement.

Knowledge of the causal flows that advance an attack supports diagnosis of how network defenses were breached.


Introduction

We craft an algorithm that determines the origin of epidemic-spreading attacks.

  Identify the “patient zero” of the epidemic.
  Reconstruct the sequence of spreading.


Introduction (cont’d)

Random moonwalk algorithm - finds the origin and propagation paths of a worm attack.

It performs post-mortem analysis on the traffic records logged by the network.

It depends on the assumption that worm propagation occurs in a tree-like structure.


Problem Formulation


Problem Formulation (cont’d)

A directed host contact graph G = (V, E)
  V = H × T
    H is the set of all hosts in the network
    T is time

Each directed edge represents a network flow between two end hosts at a certain time.
  A flow has a finite duration and involves the transfer of one or more packets.
  e = (u, v, t_s, t_e): a flow from host u to host v that starts at time t_s and ends at time t_e
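
A minimal Python sketch of this representation may help make the notation concrete. It is only an illustration: the class name Flow, the helper build_contact_graph, and the sample hosts and times are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One directed edge e = (u, v, t_s, t_e) of the host contact graph."""
    u: str      # source host
    v: str      # destination host
    t_s: float  # time at which the flow starts
    t_e: float  # time at which the flow ends


def build_contact_graph(flows):
    """Index flows by destination host (hypothetical helper for this sketch)."""
    incoming = defaultdict(list)
    for f in flows:
        if f.t_e < f.t_s:
            raise ValueError("a flow cannot end before it starts")
        incoming[f.v].append(f)
    return incoming


# Made-up sample flows: A contacts B, B contacts C, A contacts C.
flows = [
    Flow("A", "B", 1.0, 1.5),
    Flow("B", "C", 2.0, 2.4),
    Flow("A", "C", 3.0, 3.2),
]
incoming = build_contact_graph(flows)
```

Indexing by destination is a design choice of the sketch: a backward walk needs to look up the flows that arrived at a host before a given time.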


Problem Formulation (cont’d)

normal edge
  The flow does not carry an infectious payload.

attack edge
  The flow carries attack traffic, whether or not the flow is successful.

causal edge
  The flow that actually infects its destination.

Goal - Identify a set of edges from the top level of the causal tree.
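
The three edge types can be illustrated with a small Python sketch that labels flows when ground truth is available, for example in a simulation. Everything here is an assumption made for illustration: flows are tuples (u, v, t_s, t_e, carries_attack), the function name label_edges is hypothetical, and a host is taken to become infected by the first attack flow that reaches it from an already infected source while it is vulnerable.

```python
def label_edges(flows, origin, vulnerable):
    """Label each flow as 'normal', 'attack', or 'causal' (illustrative only).

    A causal edge is also an attack edge; it gets the more specific label.
    """
    infected_at = {origin: float("-inf")}  # the origin counts as infected from the start
    labels = {}
    # Process flows in order of end time so that infections happen in time order.
    for flow in sorted(flows, key=lambda f: f[3]):
        u, v, t_s, t_e, carries_attack = flow
        if not carries_attack:
            labels[flow] = "normal"
        elif (v in vulnerable and v not in infected_at
              and u in infected_at and infected_at[u] <= t_s):
            infected_at[v] = t_e        # this flow actually infects v
            labels[flow] = "causal"
        else:
            labels[flow] = "attack"     # attack traffic that infects nobody new
    return labels


flows = [
    ("A", "B", 1, 2, True),   # infects B
    ("A", "D", 1, 2, True),   # D is not vulnerable
    ("B", "C", 3, 4, True),   # infects C
    ("A", "C", 5, 6, True),   # C is already infected
    ("C", "D", 0, 1, False),  # ordinary traffic
]
labels = label_edges(flows, origin="A", vulnerable={"B", "C"})
```

The causal edges in this toy example, A to B and B to C, form the causal tree rooted at A.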


Random Moonwalk Algo.

Finds the causal relationship between flows by exploiting the global structure of worm attacks.

No use of attack content, attack packet size, or port numbers.

For an attack to progress, there has to be a communication link between the source of the attack and the compromised nodes.

These infection-causing communication flows form a causal tree, rooted at the source of the attack.

Find the tree; its root is the source of the attack.

Find the causal flows and attack flows.


Random Moonwalk Algo.

Basic Algorithm

Walk backward in time from every node for a certain distance.

At each node, choose only among the flows that fall within a certain time limit.

Repeat this W times.

Find the Z edges with the highest frequency.

Create a tree from these flows.

Most probably this is the causal tree, and its root is the source of the attack.
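
The last steps (rank the edges by frequency, then look for the root) might be sketched in Python as below. This is not the authors' procedure: the traversal counts are made-up numbers standing in for the output of the sampling step, flows are assumed to be tuples (u, v, t_s, t_e), the function names are hypothetical, and a candidate origin is simply taken to be a host that sends but never receives among the selected edges.

```python
from collections import Counter


def top_edges(counts, z):
    """Return the z most frequently traversed edges."""
    return [edge for edge, _ in counts.most_common(z)]


def candidate_origins(edges):
    """Hosts that appear as a source but never as a destination (a simplification)."""
    sources = {u for (u, v, t_s, t_e) in edges}
    destinations = {v for (u, v, t_s, t_e) in edges}
    return sources - destinations


# Made-up traversal counts, for illustration only.
counts = Counter({
    ("A", "B", 1, 2): 9,
    ("B", "C", 3, 4): 7,
    ("A", "D", 2, 3): 6,
    ("X", "Y", 5, 6): 1,
})
selected = top_edges(counts, 3)
origins = candidate_origins(selected)
```

In this toy input the three selected edges form a small tree rooted at host A, so A is the only candidate origin.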


Random Moonwalk Algo. (cont’d)

The sampling process is controlled by three parameters:

W - the number of walks (samples) performed.

d - the maximum length of the path traversed.

Δt - the sampling window size, the maximum time allowed between two consecutive edges.
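
The sampling process can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: flows are tuples (u, v, t_s, t_e), each walk is assumed to start at a uniformly random flow, and the function name random_moonwalks, the seed argument, and the toy trace are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_moonwalks(flows, W, d, delta_t, seed=0):
    """Count how often each flow is traversed by W backward random walks."""
    rng = random.Random(seed)
    incoming = defaultdict(list)           # flows indexed by destination host
    for flow in flows:
        incoming[flow[1]].append(flow)

    counts = Counter()
    for _walk in range(W):
        edge = rng.choice(flows)           # assumed starting rule: a random flow
        for _hop in range(d):              # at most d edges per walk
            counts[edge] += 1
            src, t_start = edge[0], edge[2]
            # Step backward: flows that arrived at src shortly before this one left.
            candidates = [p for p in incoming[src]
                          if p[3] < t_start <= p[3] + delta_t]
            if not candidates:
                break
            edge = rng.choice(candidates)
    return counts


# Toy trace: a small worm tree rooted at A plus one unrelated flow.
worm = [("A", "B", 1, 2), ("B", "C", 3, 4), ("B", "D", 3, 4),
        ("C", "E", 5, 6), ("D", "F", 5, 6)]
normal = [("X", "Y", 0, 1)]
counts = random_moonwalks(worm + normal, W=1000, d=10, delta_t=2)
top_edge, top_count = counts.most_common(1)[0]
```

In this toy trace every walk that starts on a worm flow ends up traversing the edge from A to B, so that edge collects the highest count, which is the effect the algorithm relies on.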


Random Moonwalk Algo. (cont’d)

Why does this algorithm work?

To propagate, some time after infection the worm creates new flows to other hosts.

This forms a link from the source to the last victim.

Traverse this link backward to find the source.

An infected host generally originates more flows than it receives.

The originators in the host contact graph are mostly clients, so normal edges usually have no predecessor within Δt.


Outline

Evaluation Methodology
Analytical Model
  Assumptions
  Edge Probability Distribution
  False Positives and False Negatives
  Parameter Selection
Real Trace Study
Simulation Study


Analytical Model (Assumptions)

The host contact graph is known, with |E| edges and |H| hosts.

Discretize time into units. Every flow has a length of one unit and fits into one unit.


Analytical Model (Probability)


Analytical Model (FP & FN)

(42 malicious edges at k = 1; 10^5 hosts in total.)


Outline

Evaluation Methodology
Analytical Model
Real Trace Study
  Detect the Existence of an Attack
  Identify Causal Edges & Initial Infected Host
  Reconstruct the Top Level Causal Tree
  Parameter Selection
  Performance
Simulation Study


Real Trace Study

Background Traffic
  A traffic trace was collected over a 4-hour period at the backbone of a class-B university network.
  Only intra-campus flows were collected (1.4 million), involving 8040 hosts.

Addition
  Add flow records to represent worm-like traffic with varying scanning rates.
  Randomly select the vulnerable hosts.


Real Trace Study (Existence)


Real Trace Study (Identify)

(800 causal edges out of 1.5 × 10^6 flows.) (The scanning rate of Trace-50 is lower than that of Trace-10.)


Real Trace Study (Identify)

Most frequently sampled edges vs. actual initial edges

(There are 800 causal edges in total; the initial 10% are the first 80 edges.) (The scanning rate of Trace-50 is lower than that of Trace-10.)


(Figure: top 60 edges, Trace-50, 10^4 walks. Labels: Blaster worm scan, original attacker.)


Real Trace Study (Parameter)

d and Δt

d = infinite


Real Trace Study (Performance)

Random moonwalk
  Z = 100, 10^4 walks

Heavy-hitter
  Find the 800 hosts with the largest number of flows in the trace, then randomly pick 100 flows.

Super-spreader
  Find the 800 hosts that contacted the largest number of destinations, then randomly pick 100 flows.

Oracle
  With a zero false positive rate, randomly select 100 flows between infected hosts.
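
The two traffic-volume baselines can be sketched in Python as below. This is an illustration only: flows are assumed to be tuples (u, v, t_s, t_e), "number of flows" is interpreted here as flows originated by a host, the flows are picked from those sent by the selected hosts, and the function names and toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def heavy_hitters(flows, n_hosts):
    """Hosts that originate the largest number of flows."""
    sent = Counter(u for (u, v, t_s, t_e) in flows)
    return {host for host, _ in sent.most_common(n_hosts)}


def super_spreaders(flows, n_hosts):
    """Hosts that contact the largest number of distinct destinations."""
    destinations = defaultdict(set)
    for (u, v, t_s, t_e) in flows:
        destinations[u].add(v)
    ranked = sorted(destinations, key=lambda h: len(destinations[h]), reverse=True)
    return set(ranked[:n_hosts])


def sample_flows(flows, hosts, z, seed=0):
    """Randomly pick up to z flows sent by the selected hosts."""
    rng = random.Random(seed)
    pool = [f for f in flows if f[0] in hosts]
    return rng.sample(pool, min(z, len(pool)))


# Toy data: A sends many flows to one host, C contacts three different hosts.
flows = ([("A", "B", t, t + 1) for t in range(4)]
         + [("C", dst, 10, 11) for dst in ("X", "Y", "Z")])
hh = heavy_hitters(flows, 1)
ss = super_spreaders(flows, 1)
picked = sample_flows(flows, ss, 2)
```

With the slide's settings the calls would use n_hosts = 800 and z = 100; the toy data shows that the two baselines can select different hosts.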


Real Trace Study (Performance)


Real Trace Study (Performance)

Scanning Method
  Smart worm (always scans valid hosts), R↑
  Scan with random addresses

C: causal edge; A: attack edge; 100: Z = 100; 500: Z = 500


Simulation Study

Simulate different background traffic.

Realistic host contact graphs tend to be much sparser, meaning the chance of communication between two arbitrary hosts is very low.

P.S. In the campus network, the accuracy is about 0.7.


Deployment and Future Work

This approach assumes the availability of complete data.

Future work: the impact of missing data on performance, and the deployment of the algorithm.


Questions?

Thank You