Research and Project Overview

Presented by:

Yevgeniy Gershteyn

Larisa Perman

05/15/2003

Anomaly Intrusion Detection

2

Outline

Research:
  Intrusion Detection Systems (IDS)
  Anomaly Intrusion Detection Systems (ADS)
  Attacks
  Neural Networks
  Applications

Project: User Profiler for UNIX
  Goal
  Design
  Hopfield NN
  Demo

3

Introduction

Intrusion detection systems (IDS) are security mechanisms that protect computer systems and networks from a stream of harmful activities.

Intrusion detection systems are divided into:
  misuse (signature-based) detection
  anomaly detection

To determine system security risk, system monitoring is used to screen users’ activities (log and usage files).

Neural networks can be applied to classify the collected data and detect intrusion attempts because of their ability to learn and generalize.

4

Misuse Detection

Detects only known attacks based on attack signatures (patterns).

Works quite accurately on known attacks and raises very few false alarms.

Cannot detect previously unknown attacks whose signatures are not available.
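The two properties above can be illustrated with a minimal sketch. It assumes signatures are plain substrings matched against log events; the signature names and patterns are hypothetical and are not taken from any real IDS rule set.

```python
# Hypothetical signature database: name -> substring that marks a known attack.
SIGNATURES = {
    "shadow-read": "cat /etc/shadow",
    "passwd-overwrite": "> /etc/passwd",
}

def match_signatures(event):
    """Return the names of all known signatures found in one log event."""
    return [name for name, pattern in SIGNATURES.items() if pattern in event]

# An event that contains a known pattern is reported.
known = match_signatures("user7 ran: cat /etc/shadow")
# A variant of the same attack has no signature and goes unnoticed.
novel = match_signatures("user7 ran: cp /etc/shadow /tmp/.x")
print(known, novel)
```

The second call shows the limitation: the attack is real, but no stored pattern matches it.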

5

Anomaly Detection

Can detect previously unknown attacks.

Looks for differences between current activities and a statistical model of past behavior.

Focuses on normal system behaviors rather than attack behaviors.

Normal behaviors have a profile that was created during the training period, in the absence of attacks, in the experimental system. Machine learning techniques are used for this.

Current activities of the system that differ from the profile are irregular behaviors and are classified as potential intrusions.
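As a rough sketch of the profile idea, assume the profile is only the mean and standard deviation of one numeric feature collected during attack-free training. The feature (commands per session), the training values, and the threshold of three deviations are all assumptions made for illustration; real systems use richer models.

```python
from statistics import mean, stdev

def build_profile(training_values):
    """Summarize attack-free training data as (mean, standard deviation)."""
    return mean(training_values), stdev(training_values)

def is_anomalous(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` deviations from the mean."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma

# Hypothetical feature: commands issued per session during the training period.
profile = build_profile([40, 42, 38, 45, 41, 39, 43])

print(is_anomalous(44, profile))   # close to past behavior
print(is_anomalous(120, profile))  # far from past behavior
```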

6

Anomaly Detection (drawbacks)

Unable to identify and classify the specific type of detected attacks.

Systems depend on the type of attack they have been looking for.

Generates a large number of false alarms.

Cannot distinguish whether irregular activities are intrusions or unusual but legal; non-intrusions can be labeled as intrusions.

If an attack occurs while the normal behavior profile is being created, the intrusive behavior becomes part of the normal behavior.

7

Anomaly Detection (cont.)

Depending on the source of input data, anomaly-based intrusion detection systems can be divided into host-based and network-based.

Host-based systems focus on the activities of users (or programs) at hosts and use different methods to build a normal behavior profile.

Network-based systems focus on the packets transmitted over the network. Based on the type of data used, they are classified into traffic models and application models.

8

Anomaly Detection (cont.)

Traffic models monitor the stream of packets and track particular parameters to detect a variety of attacks, such as port scans or denial-of-service (DoS) attacks.

Application models integrate application-specific knowledge to detect attacks on a particular service on the target machine, such as Remote-to-Local (R2L) attacks.
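One parameter a traffic model might track is the number of distinct destination ports each source contacts. The sketch below assumes packets are already reduced to (source, destination port) pairs; the addresses and the `max_ports` threshold are hypothetical.

```python
from collections import defaultdict

def port_scan_suspects(packets, max_ports=3):
    """Return sources that contacted more than `max_ports` distinct ports.

    packets: iterable of (source_ip, destination_port) pairs.
    """
    ports_by_source = defaultdict(set)
    for src, port in packets:
        ports_by_source[src].add(port)
    return sorted(src for src, ports in ports_by_source.items()
                  if len(ports) > max_ports)

# Hypothetical capture: one host probes five ports, another uses two.
packets = [("10.0.0.5", p) for p in (21, 22, 23, 25, 80)]
packets += [("10.0.0.9", 80), ("10.0.0.9", 443)]
suspects = port_scan_suspects(packets)
print(suspects)
```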

9

Attacks

Denial of service attacks attempt to render a system or service unusable to legitimate users.

Probing/surveillance attacks attempt to map out system vulnerabilities and usually serve as a launching point for future attacks.

Remote-to-local attacks attempt to gain local account privileges from a remote and unauthorized account or system.

User-to-root attacks attempt to elevate the privileges of a local user to root (or superuser) privileges.

10

Mimicry Attacks – Hacker’s Tips

Slip under the radar: the attacker tries to avoid causing any change whatsoever in the observable behavior of the application.

However, on modern systems it is hard to do serious harm this way, because it is hard to do anything in the system without system calls.

Be patient: wait passively for a time when the malicious sequence will be accepted by the IDS as normal behavior, then pause the application and insert the malicious sequence.

In this case the attacker may have to wait a long time for a chance to damage the system, but it is possible and may have dangerous outcomes.

11

Hacker’s Tips (cont.)

Be patient, but make your own luck: the attacker can improve upon passive patience by loading the dice.

The attacker can look for the most favorable path of execution and nudge the application into following that path.

For this technique the attacker must learn the behavior of the intrusion detection system and identify these paths.

Replace system call parameters: most schemes completely ignore the arguments to the system call.

The system call:
  open("/lib/libc.so", O_RDONLY)
looks indistinguishable (to the IDS) from the malicious call:
  open("/etc/shadow", O_RDWR)

Protection against this attack is not widely used in intrusion detection systems, so this method of attack deserves particular attention.
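A small sketch of why the two calls look alike: it assumes the detector reduces every traced call to its name before comparing it with the profile. The helper `syscall_name` is hypothetical and stands in for whatever the IDS does when it drops the arguments.

```python
def syscall_name(call):
    """Reduce a traced call such as 'open("/etc/shadow", O_RDWR)' to its name."""
    return call.split("(", 1)[0].strip()

benign = 'open("/lib/libc.so", O_RDONLY)'
malicious = 'open("/etc/shadow", O_RDWR)'

# A detector that keeps only the name sees two identical events.
same_to_ids = syscall_name(benign) == syscall_name(malicious)
# Keeping the full call text (name plus arguments) tells them apart.
same_with_args = benign == malicious
print(same_to_ids, same_with_args)
```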

12

Hacker’s Tips (cont.)

Insert no-ops: if there is no convenient way to insert the given malicious sequence into the application's system call stream, we can often vary the malicious sequence slightly by inserting "no-ops" into it.

In this kind of attack, the system call trace does not indicate what the attacker's purpose in the system is.

Generate equivalent attacks: generating variations on the malicious sequence without changing its effect gives the attacker an extra degree of freedom in trying to evade detection.
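The no-op trick can be sketched under one assumption: the IDS accepts a trace when every window of two consecutive system call names was seen in training. The slides do not name a detector, so this window scheme, the traces, and the use of `getpid` as a harmless filler are all illustrative.

```python
def windows(trace, k=2):
    """Return the set of all length-k windows of consecutive call names."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}

def unseen_windows(trace, normal, k=2):
    """Windows of the trace that never occurred in the training data."""
    return windows(trace, k) - normal

# Hypothetical attack-free training trace.
normal = windows(["open", "getpid", "read", "getpid", "write", "getpid", "close"])

attack = ["open", "write", "close"]
# Same effect, padded with a call that does nothing useful for the attacker.
padded = ["open", "getpid", "write", "getpid", "close"]

flagged = unseen_windows(attack, normal)
evaded = unseen_windows(padded, normal)
print(flagged, evaded)
```

The padded trace produces no unseen windows, so this detector would accept it even though the attack is unchanged.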

13

Main Goal of ADSs

The main goal of recent research on ADS is to maximize the number of detected intrusions and minimize the rate of false alarms.

It is necessary to know which characteristics of the system were learned while the profile was created and what type of information the profile contains.

It is important to choose a suitable set of features in order to enhance the structure and contents of the profile.

It is helpful to use the potential resources of a profile to determine and regulate its size and complexity, so that it is suitable (or comprehensive enough) for the needs of the system.
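The two quantities in the main goal, detected intrusions and false alarms, can be computed as below. This is a sketch that assumes labeled test data is available; the labels and alarms shown are made up, and the function assumes the data contains at least one intrusion and one normal event.

```python
def detection_and_false_alarm_rates(labels, alarms):
    """Return (detection rate, false alarm rate).

    labels[i] is True for a real intrusion; alarms[i] is True if the ADS
    flagged event i.
    """
    detected = sum(1 for real, flagged in zip(labels, alarms) if real and flagged)
    false_alarms = sum(1 for real, flagged in zip(labels, alarms) if not real and flagged)
    intrusions = sum(1 for real in labels if real)
    normal_events = len(labels) - intrusions
    return detected / intrusions, false_alarms / normal_events

# Hypothetical evaluation run: 2 intrusions and 4 normal events.
labels = [True, True, False, False, False, False]
alarms = [True, False, True, False, False, False]
rates = detection_and_false_alarm_rates(labels, alarms)
print(rates)
```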

14

Why Neural Networks?

Neural networks are used because of their ability to learn and generalize:
  able to classify inputs from exposure to a set of training inputs and application of well-defined learning rules
  can produce reasonable classifications for novel inputs (assuming the network has been trained well)

Major issues:
  how to encode the data for input to the network
  what network topology should be used
  how to train the networks
  how to perform anomaly detection with a supervised training algorithm
  what to do with the data produced by the neural network
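For the first issue, encoding, one common option is to scale numeric fields and one-hot encode categorical ones. The sketch below assumes a session is described by a login hour and a login host; both fields and the host names are hypothetical.

```python
def one_hot(value, categories):
    """Encode a categorical value as a vector with a single 1.0."""
    return [1.0 if value == c else 0.0 for c in categories]

def encode_session(hour, host, hosts):
    """Build a network input: hour scaled to [0, 1], then the one-hot host."""
    return [hour / 23.0] + one_hot(host, hosts)

# Hypothetical list of login hosts known to the system.
HOSTS = ["lab1", "lab2", "remote"]
vector = encode_session(23, "lab2", HOSTS)
print(vector)
```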

15

IDS and Neural Networks

Training mode

Supervised learning:
  Covers the entire domain because, instead of looking for a specific match, it looks for a pattern match.

Unsupervised learning:
  Tries to find patterns within a dataset and seeks to group them according to the most relevant features.
  Useful for large volumes of raw data with little (or no) knowledge of the interrelations between different fields in the vector.

16

IDS and Neural Networks

Feedback mechanism

Feed-forward networks (non-recurrent):
  Draw conclusions solely from the current data vector: no other data or previous results are used, and there is no system memory.
  Suitable for problems where the data vectors are independent of each other.
  Simple to train and deploy.

Recurrent networks:
  The result of a previous decision may influence subsequent decisions.
  Data sets need to be carefully designed to establish appropriate relations between different vectors.

17

Research Efforts

MLP (multilayer perceptron): the most common NN used in IDS
  Based on monitored data.
  Anomaly detection based on user behavior analysis.
  An alternative approach to signature-based misuse detection.
  Concern: when the data set becomes big, accuracy goes down.

Self-Organizing Map (SOM): the most recent approach
  Applied to user behavior, application, and process analysis.
  Accuracy for big data sets is as good as for small ones.

Trend: use multiple technologies
  One technique creates the data set, and an NN learns and generalizes, analyzes, and updates the data set.

18

MLP in UNIX Security

The project goal: apply NNs to the problem of user anomaly detection.

The system used an MLP network to train on and detect anomalies in real time.

The system was designed to continuously modify itself and adapt to its users’ normal behavior.

The system’s monitoring domain: user activity times, user login hosts, user foreign hosts, command set, and CPU usage.

It successfully detected a wide variety of anomalies on a student host in a university setting.

19

Host Based ID using SOM

Examines session data generated by users on a UNIX system in search of behavioral anomalies.

Collects the following session data for analysis: user group, connection type, connection source, connection time.

The collected data is preprocessed and normalized for presentation to the SOM analysis engine.

The engine has two levels:
  A 3-map tier, which summarizes the first three input domains with respect to time.
  A second tier, which correlates the conclusions of the first tier.

Result: each user group can be examined and associated with particular user behavior.

User Profiler for UNIX

Project Overview

21

Goals
  Design a system to identify users whose activities exceed their authorized level.
  Develop an algorithm to evaluate and analyze the dissimilarity between the assigned pattern and the obtained outcome for the user.
  Apply neural network techniques to solve the task.
  Choose one of the existing neural network techniques and build the neural network to perform these tasks.
  Train and test the neural network with a suitable data set.

Implementation of the designed system
  Implement the neural network
  Develop a Profiler:
    parse users’ data from UNIX log files
    prepare the input for the neural network
    analyze the neural network results
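The dissimilarity goal can be sketched with two simple measures over 0/1 vectors: the Hamming distance between the assigned pattern and the observed outcome, and a check for activity in a class the assigned pattern does not allow. Both measures are assumptions chosen for illustration; the slides do not say which algorithm the project used.

```python
def hamming(a, b):
    """Number of positions where two equal-length 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def exceeds_authorization(observed, assigned):
    """True if the user was active in a class the assigned pattern forbids."""
    return any(o == 1 and a == 0 for o, a in zip(observed, assigned))

assigned = [1, 1, 0, 0, 0]      # pattern assigned to the user
over_level = [1, 1, 0, 1, 0]    # hypothetical outcome: activity in class 4
under_level = [1, 0, 0, 0, 0]   # hypothetical outcome: less activity than allowed

print(hamming(assigned, over_level), exceeds_authorization(over_level, assigned))
print(hamming(assigned, under_level), exceeds_authorization(under_level, assigned))
```

Both outcomes differ from the pattern in one position, but only the first is above the authorized level, which is why distance alone is not enough.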

22

Design

Classified UNIX commands into five classes according to how harmful they are.

Created 3 main user patterns:
  student { 1, 1, 0, 0, 0 }
  faculty { 1, 1, 1, 0, 0 }
  admin   { 1, 1, 1, 1, 1 }

Since actual data wasn’t available at the time, the data set was simulated.
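A sketch of how a user's commands could be turned into a 5-element vector comparable with the three patterns. The assignment of commands to classes is hypothetical; the slides state only that there are five classes ordered by harm, not which command belongs where.

```python
# Hypothetical mapping: command -> harm class, 1 (harmless) to 5 (most harmful).
COMMAND_CLASS = {
    "ls": 1, "pwd": 1,
    "gcc": 2, "mail": 2,
    "chmod": 3,
    "kill": 4,
    "useradd": 5,
}

# The three user patterns from the design.
PATTERNS = {
    "student": [1, 1, 0, 0, 0],
    "faculty": [1, 1, 1, 0, 0],
    "admin": [1, 1, 1, 1, 1],
}

def activity_vector(commands):
    """Set element i to 1 if any command of class i+1 was used."""
    vector = [0] * 5
    for command in commands:
        harm_class = COMMAND_CLASS.get(command)
        if harm_class is not None:  # unknown commands are ignored in this sketch
            vector[harm_class - 1] = 1
    return vector

observed = activity_vector(["ls", "gcc", "chmod", "ls"])
print(observed, observed == PATTERNS["faculty"])
```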

23

Neural Network

Single layer Hopfield Network with 5 interconnected neurons

Note: Hopfield Network was described by Dr. Reznik in Topic 5

[Figure: Hopfield network with five interconnected neurons, numbered 1 to 5]

24

Hopfield Network

Input X, output Y, weight matrix W

Stored patterns Y1, Y2, Y3

Calculate the weight matrix:

$$W = \sum_{m=1}^{M} Y_m Y_m^{T} - M I$$

Activate the net by applying the input vector X

Calculate the actual output Y

Compare the result with the appropriate pattern
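The steps on this slide can be sketched as follows. This is not the project's HopfieldNetwork.java. It is a minimal illustration under three assumptions: the 0/1 patterns are converted to bipolar form (0 becomes -1) so that the term M * I zeroes the diagonal, neurons are updated one at a time, and a neuron keeps its state when its input sum is zero.

```python
def to_bipolar(pattern):
    """Map a 0/1 pattern to -1/+1 (assumed coding for the weight formula)."""
    return [1 if v else -1 for v in pattern]

def to_binary(state):
    """Map a -1/+1 state back to 0/1."""
    return [1 if v > 0 else 0 for v in state]

def weight_matrix(patterns):
    """W = sum over m of Y_m Y_m^T, minus M * I."""
    n, m = len(patterns[0]), len(patterns)
    w = [[0] * n for _ in range(n)]
    for y in patterns:
        for i in range(n):
            for j in range(n):
                w[i][j] += y[i] * y[j]
    for i in range(n):
        w[i][i] -= m
    return w

def recall(w, x, max_sweeps=10):
    """Update neurons one by one until a full sweep changes nothing."""
    state = list(x)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w[i][j] * state[j] for j in range(len(state)))
            new = state[i] if field == 0 else (1 if field > 0 else -1)
            if new != state[i]:
                state[i], changed = new, True
        if not changed:
            break
    return state

STUDENT, FACULTY, ADMIN = [1, 1, 0, 0, 0], [1, 1, 1, 0, 0], [1, 1, 1, 1, 1]
W = weight_matrix([to_bipolar(p) for p in (STUDENT, FACULTY, ADMIN)])

# Activate the net with an input X and read the output Y.
Y = to_binary(recall(W, to_bipolar([1, 1, 1, 1, 0])))
print(Y)
```

Under these assumptions each stored pattern is a fixed point of the update, and the output Y can then be compared with the pattern assigned to the user.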

25

Implementation

DEMO

Source code:
  HopfieldNetwork.java
  Matrix.java
  Profiler.java
  ProfilerException.java
  SystemMaps.java
  Tools.java
  UserInfo.java

Data:
  system.dat
  users.dat

26

Conclusion

MLP is a widely used network. However, it is not useful for big input sets.

Recurrent networks improve on MLP, since they are self-learning.

Our experiment with the Hopfield network confirmed that self-learning is a good technique for IDS.

In our opinion, unsupervised networks will displace supervised learning.

27

References

A. Ghosh, A. Schwartzbard, and M. Schatz. Learning Program Behavior Profiles for Intrusion Detection. Proceedings of the Workshop on Intrusion Detection and Network Monitoring, April 1999.

K. Tan. The Application of Neural Networks to UNIX Computer Security. http://citeseer.nj.nec.com/tan95application.html (2003).

A. Ghosh and A. Schwartzbard. A Study in Using Neural Networks for Anomaly and Misuse Detection. Proceedings of the 8th USENIX Security Symposium, August 1999.

D. Wagner and P. Soto. Mimicry Attacks on Host-Based Intrusion Detection Systems. Proceedings of the 9th ACM Conference on Computer and Communications Security, Washington, DC, USA, pages 255–264, 2002.

R. Sekar, A. Gupta, J. Frullo, T. Shanbhag, A. Tiwari, H. Yang, and S. Zhou. Specification-Based Anomaly Detection: A New Approach for Detecting Network Intrusions. Proceedings of the 9th ACM Conference on Computer and Communications Security, pages 265–274, 2002.

C. Krügel, T. Toth, and E. Kirda. Service Specific Anomaly Detection for Network Intrusion Detection. Proceedings of the 17th ACM Symposium on Applied Computing, pages 201–208, 2002.

A. Seleznyov and O. Mazhelis. Learning Temporal Patterns for Anomaly Intrusion Detection. Proceedings of the 17th ACM Symposium on Applied Computing, pages 209–213, 2002.

28

Questions

Thank you
