Computer System Intrusion Detection: A Survey Anita K. Jones & Robert S. Sielken Presented by...
Computer System Intrusion Detection: A Survey
Anita K. Jones & Robert S. Sielken
Presented by Peixian Li (Rick)
For CS551/651 Computer Security
Overview
• Why IDS
• IDS Overview
• Anomaly Detection
• UNM Pattern Matching
• Misuse Detection
• Extended to Networked Systems
• Conclusion
Why IDS ?
• In defending network resources, we have
– Firewalls
– Encryption technology
– Authentication devices
– Vulnerability checking tools
– Others …
Why IDS ? -2-
• But computer systems are still susceptible
– Due to unknown system flaws
– Due to known system flaws that are left in place because removing them would cost too much or reduce functionality
– Due to social engineering tricks
• A recent news item
– An 18-year-old boy broke into an e-commerce web site
– Thousands of customers' credit information was stolen
– Including Bill Gates'
Why IDS ? -3-
• Based on the fact that
– Penetrations always exist
• We need
– A second line of defense
– A mechanism to detect penetrations and attempted intrusions
– In the form of an Intrusion Detection System
• Even when attempts are guaranteed to fail
– An IDS can still help us find potential vulnerabilities
Approaches
• Anomaly Detection
– Defines and characterizes correct static form and acceptable dynamic behavior
– Detects anomalous changes or behaviors, which may not be intrusions
• Misuse Detection
– Characterizes known ways to penetrate a system as patterns
– Monitors for explicit patterns which are known to be intrusions
Anomaly vs. Misuse
• Anomaly Detection
– May have a high rate of false alarms
– Can detect novel attacks
– Normal databases are relatively more stable
• Misuse Detection
– May miss novel attacks
– Complexity grows as the number of well-known attacks grows
– Difficult to keep them updated as the catalog of attacks grows
Three Generations
• First Generation
– The emphasis was on single computer systems
– O/S audit records were post-processed
• Second Generation
– Extended and scaled to address distributed systems
– More sophisticated
– Primitive real-time alerts became possible
Three Generations -2-
• The Third Generation
– Further extended to address loosely coupled networks, such as LANs
• Two Primary Challenges
– Tracking users as they move through nodes
– Managing the data as the size of the network scales up
What Makes A Good IDS ?
• Manage the volume of data, communications, and processing in large scale networks
• Increase coverage, i.e. miss as few intrusions as possible
• Decrease false alarms
• Detect intrusion in progress
• React in real-time
Basic Components
• Focus
– Which entity, or which elements of the entity, do we focus on
– Definitions of events or behavior of interest
• Representation
– How to represent signatures effectively and efficiently
Basic Components -2-
• Initial Database
– Initial behavior profile or normal database
– Which can characterize behavior of interest
– Which can represent entity of interest
• Detection Algorithm
– Statistical processing techniques for divining the difference between normal and anomalous behavior (effective and efficient)
Anomaly Detection
• Static
– Assumes that a portion of the system remains constant
– System code and a portion of system data
– Represented as a binary bit string or a set of such strings
– Even a single bit change counts as a deviation
• Dynamic
– Assumes that the system's behavior is stable
– Includes a definition of behavior
– Represented as a sequence of distinct events
– Uses empirical thresholds
Static Anomaly Detection
• How does it work?
– Defines the desired state of the system using static bit strings
– Archives a representation of the state
– Periodically compares the current state and the archived state
– Any difference signals an error
Signature
• Storing and comparing the actual bit strings is quite costly
• A compressed representation is called a signature
• Signatures are computed with checksums, message-digest algorithms, and hash functions
• Meta-data: knowledge about the structure
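The archive-and-compare cycle with signatures can be sketched as below. This is only an illustration: SHA-256 is one possible hash function, and the object names, byte contents, and helper names (`signature`, `archive_state`, `compare_state`) are hypothetical. Meta-data checks are left out.

```python
import hashlib

def signature(data: bytes) -> str:
    """Compress a bit string into a fixed-size signature (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()

def archive_state(objects: dict[str, bytes]) -> dict[str, str]:
    """Record one signature per monitored object."""
    return {name: signature(data) for name, data in objects.items()}

def compare_state(archived: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """Return the names whose signature changed, or that appeared or disappeared."""
    current_sigs = archive_state(current)
    names = set(archived) | set(current_sigs)
    return sorted(n for n in names if archived.get(n) != current_sigs.get(n))

# Hypothetical monitored objects; the byte strings stand in for file contents.
baseline = archive_state({
    "/bin/login": b"\x7fELF-original",
    "/etc/passwd": b"root:x:0:0",
})
changed = compare_state(baseline, {
    "/bin/login": b"\x7fELF-Original",   # a single byte differs
    "/etc/passwd": b"root:x:0:0",
})
print(changed)
```

Only the fixed-size digests are archived, so the comparison cost does not depend on keeping full copies of the monitored bit strings.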
Some Actual Systems
• Tripwire
– A file integrity checker
– Uses signatures as well as Unix file meta-data
• Virus Checkers
– Use the actual bit strings inserted by the virus
– Strings are short, thus uncompressed
• Self-Nonself
– Unlike Tripwire, the Self-Nonself signatures are for unwanted string values
Dynamic Anomaly Detection
• Before Running
• For each individual entity, the IDS creates a base profile to characterize normal, acceptable behavior
– Entities can be: users, workstations, remote hosts, or applications
– Behaviors can be: preferred choices, resources consumed, representative sequences of actions
Dynamic Anomaly Detection -2-
• Two ways to build up base profiles
– By synthetically running the system
• Can it represent the real system?
– By observing normal user behavior over a sufficiently long time
• Can we be sure that no intrusion took place during that period?
Dynamic Anomaly Detection -3-
• When Running
• Observes events related or attributed to the entity
• Incrementally builds a current profile
• Some operate in real time or near real time, directly observing events as they occur
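One way to picture a profile that is built incrementally is a running count of events per entity. The sketch below assumes a profile is just a table of relative event frequencies; the class name `CurrentProfile` and the event names are hypothetical, and real systems track richer measures.

```python
from collections import Counter

class CurrentProfile:
    """Running event counts for one entity (a user, host, or application)."""

    def __init__(self) -> None:
        self.counts = Counter()
        self.total = 0

    def observe(self, event: str) -> None:
        """Fold one observed event into the profile."""
        self.counts[event] += 1
        self.total += 1

    def frequencies(self) -> dict[str, float]:
        """Relative frequency of each event type seen so far."""
        if self.total == 0:
            return {}
        return {event: n / self.total for event, n in self.counts.items()}

profile = CurrentProfile()
for event in ["login", "read", "read", "write"]:
    profile.observe(event)
print(profile.frequencies())
```

Each event updates the profile in constant time, which is what makes real-time or near-real-time operation plausible.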
Dynamic Anomaly Detection -4-
• Static detection does not consider the degree of the difference
• Dynamic detection does
• Comparison is based on empirically determined thresholds
• Only mismatches that exceed the thresholds result in an alert
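The threshold comparison can be sketched as a distance between the base profile and the current profile. The distance measure (half the L1 distance between frequency tables), the threshold value 0.5, and the event names are all assumptions made for illustration, not values from the survey.

```python
def profile_distance(base: dict[str, float], current: dict[str, float]) -> float:
    """Half the L1 distance between two frequency profiles: 0 = identical, 1 = disjoint."""
    events = set(base) | set(current)
    return 0.5 * sum(abs(base.get(e, 0.0) - current.get(e, 0.0)) for e in events)

def is_anomalous(base: dict[str, float], current: dict[str, float], threshold: float) -> bool:
    """Alert only when the mismatch exceeds the empirically chosen threshold."""
    return profile_distance(base, current) > threshold

base = {"read": 0.5, "write": 0.25, "login": 0.25}
mild_drift = {"read": 0.5, "write": 0.5}      # distance 0.25
strong_drift = {"chmod": 0.75, "read": 0.25}  # distance 0.75

THRESHOLD = 0.5  # hypothetical; in practice tuned on observed data
print(is_anomalous(base, mild_drift, THRESHOLD))
print(is_anomalous(base, strong_drift, THRESHOLD))
```

Raising the threshold trades missed intrusions for fewer false alarms; lowering it does the reverse.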
UNM Pattern Matching
• Focus
– Individual application and its behavior
– E.g. Sendmail
• Representation
– Uses privileged system call sequences to represent an application's behavior
– E.g. (open, read, mmap), (read, mmap, mmap)…
– Sequence length usually between 3 and 6
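The example sequences above are consistent with sliding a fixed-length window over a trace of system calls. A minimal sketch, assuming a trace is a list of call names (the helper name `windows` is hypothetical):

```python
def windows(trace: list[str], k: int) -> list[tuple[str, ...]]:
    """All length-k subsequences of consecutive system calls in the trace."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

trace = ["open", "read", "mmap", "mmap"]
print(windows(trace, 3))
# [('open', 'read', 'mmap'), ('read', 'mmap', 'mmap')]
```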
UNM Pattern Matching -2-
• Initial Database
– Built either by synthetically running the application or by observing it in real use
– Normal sequences are stored as a forest in the normal database to save space
• Detection Algorithm
– Largest Minimum Hamming Distance
– Normalized LMHD
– Local Frame Count
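The three measures can be sketched as follows. The definitions used here are assumptions: the minimum Hamming distance of a sequence is its distance to the closest normal sequence, LMHD is the largest such value over a trace, the normalized form divides by the sequence length, and the local frame count is the largest number of unseen sequences inside any run of `frame` consecutive sequences. The normal database is a plain set rather than a forest, and the traces are made up.

```python
def windows(trace, k):
    """All length-k subsequences of consecutive system calls."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def min_hamming(seq, normal_db):
    """Distance from seq to the closest sequence in the normal database."""
    return min(hamming(seq, normal) for normal in normal_db)

def largest_min_hamming(trace, normal_db, k):
    """Largest minimum Hamming distance over all sequences in the trace."""
    return max((min_hamming(s, normal_db) for s in windows(trace, k)), default=0)

def local_frame_count(trace, normal_db, k, frame):
    """Largest count of unseen sequences within any `frame` consecutive sequences."""
    unseen = [s not in normal_db for s in windows(trace, k)]
    starts = range(max(1, len(unseen) - frame + 1))
    return max(sum(unseen[i:i + frame]) for i in starts)

K = 3
normal_trace = ["open", "read", "mmap", "mmap", "open", "read", "mmap"]
normal_db = set(windows(normal_trace, K))

suspect = ["open", "read", "mmap", "exec", "exec", "read"]
lmhd = largest_min_hamming(suspect, normal_db, K)
print(lmhd, lmhd / K, local_frame_count(suspect, normal_db, K, frame=2))
```

Dividing by the sequence length makes values comparable across runs that use different window sizes.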
UNM Pattern Matching -3-
[Figure: total sequences in the normal database (y-axis, 0 to 900) versus total sequences scanned (x-axis, 0 to 24915)]
UNM Pattern Matching -5-
[Figure: % anomalous (y-axis, 0 to 70) over x-axis values 1 to 8; series: Largest Min HD, Maximum LFC]
Misuse Detection
• Remember known techniques
• Monitor the system for any occurrence of those known techniques
• Intrusion scenario – A description of a fairly precisely known kind of intrusion, usually given as a sequence of actions
Rule-Based vs. State-Based
• Rule-Based
– Encodes scenarios as a set of rules, where rules reflect the sequence of actions
– Fact base is a collection of assertions based on accumulated data
– Rule base contains the rules that describe known intrusion scenarios
– Rule-fact binding
– Rule fires
• State-Based
– Attribute-value pairs characterize system states of interest
– Actions are defined as transitions between states
– Monitors the actions and changes the state accordingly
– If a compromised state is reached, an intrusion has happened
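The state-based approach can be sketched as a small transition table over attribute-value pairs. The scenario (copying a shell, making the copy setuid, running it), the attribute names, and the action names are hypothetical and chosen only to show the mechanism.

```python
# Each action maps to (guard, effect): the attribute values the state must have
# for the transition to apply, and the attribute values it sets.
TRANSITIONS = {
    "copy_shell":   ({}, {"shell_copy": True}),
    "chmod_setuid": ({"shell_copy": True}, {"setuid_copy": True}),
    "run_copy":     ({"setuid_copy": True}, {"root_shell": True}),
}
COMPROMISED = {"root_shell": True}

def matches(state: dict, required: dict) -> bool:
    """True when every required attribute-value pair holds in the state."""
    return all(state.get(attr) == value for attr, value in required.items())

def monitor(actions: list[str]):
    """Return the index of the action that reaches a compromised state, or None."""
    state = {"shell_copy": False, "setuid_copy": False, "root_shell": False}
    for index, action in enumerate(actions):
        guard, effect = TRANSITIONS.get(action, ({}, {}))
        if matches(state, guard):
            state.update(effect)
        if matches(state, COMPROMISED):
            return index
    return None

print(monitor(["ls", "copy_shell", "chmod_setuid", "run_copy"]))
print(monitor(["chmod_setuid", "copy_shell", "run_copy"]))
```

Unrelated actions leave the state untouched, so the monitor reports an intrusion only when the transitions occur in an order that leads to the compromised state.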
Extended to Networked Systems
• New situations
– Cooperative intrusions are more frequent
– Intruder(s) use multiple nodes in an attempt to
• Perform actions in parallel to make the intrusion faster
• Distribute actions to disguise their activities
• New elements in Network IDS
– Include network traffic as part of behavior
– Data sharing and communication
Centralized vs. Decentralized
• Centralized Analysis
– Audit data is collected on individual systems
– Reported to some centralized location
– Intrusion detection analysis is performed there
– Doesn't work well for large networks due to the sheer volume of data
– Needs data translation in heterogeneous systems
• Decentralized Analysis
– Distributed audit data collection
– Distributed intrusion detection analysis
– Works well for large networks because less data is shared between different components
– Can eliminate the translation problem by grouping homogeneous systems
Partition
• In a decentralized system, the entire system is divided into smaller domains for the purpose of communication
• Partition can be based on
– Geography
– Administrative control
– Collections of similar software platforms
– Anticipated types of intrusions
• Still centralized within a domain
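Partitioning by one of the listed criteria amounts to grouping hosts on a shared attribute. In the sketch below the host records, field names, and the helper `partition` are hypothetical.

```python
from collections import defaultdict

def partition(hosts: list[dict], key: str) -> dict[str, list[str]]:
    """Group host names into domains that share the same value for `key`."""
    domains = defaultdict(list)
    for host in hosts:
        domains[host[key]].append(host["name"])
    return dict(domains)

hosts = [
    {"name": "alpha", "platform": "solaris", "site": "east"},
    {"name": "beta",  "platform": "linux",   "site": "east"},
    {"name": "gamma", "platform": "linux",   "site": "west"},
]
print(partition(hosts, "platform"))
print(partition(hosts, "site"))
```

Grouping on platform keeps each domain homogeneous, which is how the data translation problem is avoided inside a domain.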
Vulnerabilities
• Intrusion detection software is itself not inherently survivable and also needs protection
• Initialization will be flawed if intrusions are already present
• Audit data must be available in a timely manner
• The IDS should not compete for resources with the rest of the system