A Distributed Stealthy Coordination Mechanism for Worm Synchronization Gaurav Kataria, Gaurav Anand,...

30
A Distributed Stealthy Coordination Mechanism for Worm Synchronization Gaurav Kataria, Gaurav Anand, Rudolph Araujo, Ramayya Krishnan and Adrian Perrig Carnegie Mellon University 2007.8.17 by Mike Hsia 2nd International Conference on Security and Privacy in Communication Networks (Securecomm), 2006

Transcript of A Distributed Stealthy Coordination Mechanism for Worm Synchronization Gaurav Kataria, Gaurav Anand,...

A Distributed Stealthy Coordination Mechanism for Worm Synchronization

Gaurav Kataria, Gaurav Anand,Rudolph Araujo, Ramayya Krishnan and Adrian PerrigCarnegie Mellon University

2007.8.17 by Mike Hsiao

2nd International Conference on Security and Privacyin Communication Networks (Securecomm), 2006

2

Outline

Introduction Related Work Background Worus Propagation Model Probabilistic Counting Simulating A P2P Network and Results Countermeasures Feature Work and Conclusions Comments

3

Introduction

Attack Hacker Automated tools Propagation Even quicker Self-stopping Slow and covert

propagation

Defense Administrator Automated detection Traffic analysis Early detection

Evolution

In this paper, the authors propose slow andcovert propagation (even in early stages ofworm infection) to avoid worm detectionuntil a critical mass of nodes is infected.

paradigm changes

4

Introduction: propagation via P2P

P2P file-sharing networks can provide a vast overlay network ideal for stealthy propagation. they are large and distributed networks that are neither managed nor controlled by anyone, due to enormous amount of data transfer that takes pla

ce on them, it is much easier to conceal malicious content and

messages as part of regular communication, and nodes connected to a P2P network by definition ar

e aware of other nodes on the network and hence do not have to randomly scan the Internet for vulnerable hosts.

5

Introduction: virus v.s. worm

Of the more than twenty thousand worms and viruses reported by Symantec in year 2005, only a handful were scanning-based self-propagating worms that exploited a software vulnerability[27]. The majority of malware discovered was viruses (email,

P2P or instant messaging) It can be difficult to detect a virus if it does not exhibit a

ny unusual (host-based) behavior. In theory

a P2P virus can persist under the radar as long as it behaves like a normal P2P application,

surreptitiously using the P2P overlay network itself for distributed coordination with other infected nodes.

[27] D. Turner, et al. “Symantec Internet Security Threat Report,” Volume VIII,20330 Stevens Creek Road, Cupertino CA: Symantec Corporation, 2005.

6

Introduction: worus The authors present a stochastic model for P2P

virus propagation and coordination that once a critical mass of nodes get infected,

all covert nodes become aware of that with high probability and can drop their cover

to simultaneously launch a usual scanning-based Internet-wide worm infection.

Const Respond Time

Hosts infected in TR

A worus can potentially beat wormdefenses that require a typical delayof TR after detection to activatefilters over the global Internet.

7

Related Work

Epidemiology Biology

Birth, death, cure rate Internet worm propagation model Non-homogeneous network

ex. wireless network dynamically changing connectivity levels

Email virus Instant messaging worms

Enhance the severity of infection increase infection rate and stealth [30]

hit-list scanning, permutation scanning

[30] N. Weaver, V Paxson, S. Staniford, and R. Cunningham, "A taxonomy of computer worms,“in First Workshop on Rapid Malcode (WORM), 2003.

8

Related Work (cont’d)

Security aspects of P2P Improving the architecture Strengthening the network P2P network itself is used as an attack vect

or Gnutella querying flooding Gnutella protocol weakness

DDoS, privacy violation

9

Background: Gnutella

Searching on Gnutella is done rather randomly with queries being propagated along all paths except the path through which the query was received. This search continues up to a limit of 7 hops determined via a TTL field in the search in the Gnutella packer header.

The Query Routing Protocol was developed in which special nodes, called ultrapeers, maintain a hash table of content available at their respective leaf nodes. They perform a lookup within these tables before forwar

ding a query to the leaves in order to limit unnecessary query messages.

10

Background: P2P virus

P2P viruses propagate by copying themselves to the user's shared directory under a deceptive name.

The users who download these files also copy the virus and thus continue to propagate it on the file sharing network. In the past three years, over 300 such viruses have b

een found. P2P viruses like other worms and viruses strive to s

teal either user information (for identity theft, fraud etc.) or computing resource (to form botnets, DDoS etc.).

11

Worus Propagation Model:diffusion in virus phase

The worus behaves like a normal (P2P) virus in its initial phase when the malcode diffuses through user queries.

The users query for files and receive responses, some of which are infected by the worus malcode. Once the user clicks on an infected file, its P2P clien

t gets infected and starts returning worus infected files in response to

queries from other users. At this time the worus does not commit any ot

her malicious action in order to avoid detection by early warning systems.

12

Worus Propagation Model:model of diffusion in virus phase

)()()'( ttNtN # of infected nodes at time t # of infected nodes during time t’-t

Probability that an uninfectednode is compromised withina time interval

Total # of nodes on the network

# of uninfected nodes at time t

)]([)( 0 tNNpt i

1. at least one of the query responses received is infected2. the user clicks on a infected response

13

Worus Propagation Model:probabilities

1. Query-hit The average number of nodes that can be reached 0-6

hops away from any random node [Table I] Each node response on an average 1% of the (any random) qu

eries they receive.[8] A variable parameter f is used to set the limit on the co

nsideration set. Depending on whether the querying node is a leaf or an

ultrapeer, we can determine the probability that a query-hit in a user's consideration set is malicious, Pm.

2. Selection The author assume the user is equally likely to select a

ny response from the consideration set.

fpp mi 1

14

[Table 1] Reach of ultrapeer and leaf nodes in a Gnutella network

Hops # of nodes reached from ultrapeer

# of nodes reached from leaf

0 25 4

1 156 24

2 1867 1354

3 16465 12317

4 37429 32807

5 0* 32807

6 0* 7266

* Several client reduce the TTL of a query to 4.

15

Probability Counting:monitoring the state of infection

Nodes rely on query and query-hit messages to coordinate every infected node launches the attack at the same time malcode is not detected before it reaches the critical ma

ss Synchronization is achieved through multiple stages of c

oordination Each node in the network can belong to one of three pha

ses based on its knowledge of the spread of infection in the network.

Node learns of more infections on the network it progresses to later phases.

With every successive phase change, the level of synchronization achieved across nodes is higher.

16

Probability Counting:phase 1

When a node is initially infected (by downloading a virus from a malicious peer), it is considered to be in phase 1.

In order to estimate infected node count, each infected node generates R files with b bit long random filenames. All infected nodes then periodically query for r (≦ R)

files each of which has a random filename of length b bits expecting positive responses (query-hits) from other infected nodes.

A node continues to remain in phase 1 until it receives t (≦ r) query-hits.

On receiving t hits, the infected node progresses to phase 2.

The same set of r files is queried periodically.

15 171510

17

6 10

10

6

17

Probability Counting:phase 1 (cont’d)

The authors model the event of receiving a query hit as a Bernoulli trial and compute the probability of getting t hits using a binomial distribution. compute the probability of a node being in

phase 1, i.e., getting less than t hits for the r queries sent

With an estimate of the number of infected nodes in the network currently in phase 1, N1, and the probability of being in phase 1, using recursion

)Pr(1] phasein nodes infected of E[# 1 thitsN

6

6 10

18

Probability Counting:phase 2

In order to achieve synchronization among infected nodes, nodes in phase 2 have to speed up the process of informing other nodes in phase 1 of the current infection count.

The nodes in phase 2 generate R’-R additional files so that nodes in phase 1 would have higher probability o

f receiving a hit. They now query for r’ files, which include the r files u

sed in phase 1. The phase 2 nodes wait for at least t’ (≧ t) hits after

which they can progress to phase 3.

70

15 1011 6

19

Probability Counting:phase 3

Once a node gets into phase 3, it is ready to launch the attack.

Nodes in phase 3 generate R"-R'-R files. This is to help nodes that are still in phase 1 or

phase 2 to quickly transition to phase 3.

4 phasein nodeshost infected of #

3 phasein nodeshost infected of #

2 phasein nodeshost infected of #

1 phasein nodeshost infected of #

x

x

x

x

][][][)(][ xExExEtNxE

400

20

Probability Counting:phase 4

When a node reaches phase 3, the worus turns into a worm and attempts to infect all the nodes by using software vulnerabilities. The vulnerable nodes that get infected through this

process are considered to be in phase 4. Phase 3 and 4 are essentially differentiated by the

mechanism through which they are infected, otherwise they exhibit the same behavior.

21

Probability Counting:analysis

Probability of an infected node getting a query-hit is determined.

Phase 1 The probability of getting t hits from r queries is m

odeled as a binomial distribution

)1( 2/)( bRxRxRxxRep

)Pr(][][

)Pr(

),(1

0

thitsxExE

Pthits

qptrCP

prevnew

t

tt

trtt

q = 1 - p

6 10

22

Probability Counting:analysis

Phase 2 The probability of getting t’ hits from r’ queries i

s modeled as a binomial distribution

Phase 3 (page 18)

Phase 4

)Pr(][][

)Pr(

),(1

0

thitsxExE

Pthits

qptrCP

prevnew

t

tt

trtt

6 15

23

Simulation a P2P Network

Their simulator, CMU-GNS, is built on top of a basic Gnutella simulator GnucNS developed by the makers of the Gnucleus Gnutella client.

The GnucNS simulator was designed as a tool to study the performance of the Gnutella network. supports version 0.6 of the Gnutella protocol support for features such as ultrapeers, node upgra

de and downgrade algorithms, and average connectivity.

24

Simulation Results

They model the spread of infection on a small network of 500 nodes.

The parameters were adjusted so that the final worm attack would be launched when the number of nodes ready to attack was close to 90%.

b R r t R’ r’ t’ R’’

17 15 10 6 85 15 11 500

25

Countermeasures:quick synchronization

1

2

3, 4

26

Countermeasures:dummy hits effect

1

2

3, 4

27

Countermeasures:dummy hits effect

The authors observed that the effectiveness of their attack stems from the fact that a large number of nodes simultaneously initiate the attack.

They suggest that ultrapeers respond with bogus query hits to querying nodes, in order to introduce noise in their counting mechanism and thus throw their synchronization off.

Rating system can effective against known viruses.

28

Future Work

Parameters of diffusion model should be validated.

Nodes may offline/online dynamically.

Model the download pattern of query hits in a consideration set.

29

Conclusions

The authors show that a covert worm can exist in a P2P file-sharing network. the stealth during the initial phase the quick ramp up on an attack state

The authors show that combining two attack methodologies can only serve to increase the magnitude of the attack.

The absence of a command and control channel differentiates Worus from other traditional worms.

30

Comments

Internet worm 通常是直接暴露在網路上 . 但 E-mail worm, instant message worm, P2P worm 會藉由其他的系統隱藏自己 .

File sharing 的用意即在分享檔案本身 , worm 的散佈和一般的檔案分享是無差異的 .

Worm 散佈的觀察是整體性的 , 因此在小區域不見得可以觀察到急遽上升的感染 .

Virus 不見得需要弱點 , 但是 worm 在定義上需要依賴弱點才能攻破 . 因此 network-based 行為模式的偵測方式不一定適用在 virus 上 .

當 attacker 比當 detector 要容易得多 , 讓攻擊流程越接近正常行為越可以避開偵測 .