Machine Learning for Network Anomaly Detection
Matt Mahoney
Network Anomaly Detection
• Network – Monitors traffic to protect connected hosts
• Anomaly – Models normal behavior to detect novel attacks (some false alarms)
• Detection – Was there an attack?
Host Based Methods
• Virus Scanners
• File System Integrity Checkers (Tripwire, DERBI)
• Audit Logs
• System Call Monitoring – Self/Nonself (Forrest)
Network Based Methods
• Firewalls
• Signature Detection (SNORT, Bro)
• Anomaly Detection (eBayes, NIDES, ADAM, SPADE)
User Modeling
• Source address – unauthorized users of authenticated services (telnet, ssh, pop3, imap)
• Destination address – IP scans
• Destination port – port scans
Frequency Based Models
• Used by SPADE, ADAM, NIDES, eBayes, etc.
• Anomaly score = 1/P(event)
• Event probabilities estimated by counting
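A minimal sketch of the counting idea, for illustration only. The function names are hypothetical, and giving never-seen events an infinite score is an assumption of this sketch; the systems named above each handle unseen events in their own way.

```python
from collections import Counter

def train_counts(events):
    """Count how often each event value appears in training traffic."""
    return Counter(events)

def anomaly_score(counts, event):
    """Return 1/P(event), with P(event) estimated as count/total."""
    total = sum(counts.values())
    count = counts[event]  # Counter returns 0 for unseen events
    if count == 0:
        return float("inf")  # assumption: unseen events are maximally anomalous
    return total / count

# Hypothetical training data: destination ports seen on four connections.
counts = train_counts([80, 80, 80, 25])
common = anomaly_score(counts, 80)   # 4/3
rare = anomaly_score(counts, 25)     # 4.0
unseen = anomaly_score(counts, 23)   # inf
```

Rare events score higher than common ones, which is the sense in which the score ranks port scans or unusual source addresses.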
Attacks on Public Services
PHF – exploits a CGI script bug on older Apache web servers
GET /cgi-bin/phf?Qalias=x%0a/usr/bin/ypcat%20passwd
Buffer Overflows
• 1988 Morris Worm – fingerd
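• 2003 SQL Sapphire Worm
char buf[100];
gets(buf);  /* no bounds check: input longer than buf overwrites the stack */
[Figure: stack diagram with labels buf (offsets 0 to 100), Exploit code, and Return Address]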
TCP/IP Denial of Service Attacks
• Teardrop – overlapping IP fragments
• Ping of Death – IP fragments reassemble to > 64K
• Dosnuke – urgent data in NetBIOS packet
• Land – identical source and destination addresses
Protocol Modeling
• Attacks exploit bugs
• Bugs are most common in the least tested code
• Most testing occurs after delivery
• Therefore unusual data is more likely to be hostile
Protocol Models
• PHAD, NETAD – Packet Headers (Ethernet, IP, TCP, UDP, ICMP)
• ALAD, LERAD – Client TCP application payloads (HTTP, SMTP, FTP, …)
Time Based Models
• Training and test phases
• Values never seen in training are suspicious
• Score = t/p = tn/r, where
– t = time since last anomaly
– n = number of training examples
– r = number of allowed values
– p = r/n = fraction of values that are novel
Example tn/r
• Training: 0000111000 (n/r = 10/2)
• Testing: 01223
– 0: no score
– 1: no score
– 2: tn/r = 6 x 10/2 = 30
– 2: tn/r = 1 x 10/2 = 5
– 3: tn/r = 1 x 10/2 = 5
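The example can be reproduced with a short sketch. The class name and interface are hypothetical. The slides do not say how the anomaly clock is initialized, so the sketch takes the time of the last anomaly as an explicit parameter; the value 7 is chosen only because it yields the t = 6 used above when the test values arrive at times 11 through 15.

```python
class TimeBasedModel:
    """Scores values never seen in training as t * n / r."""

    def __init__(self, training, last_anomaly_time):
        self.allowed = set(training)   # values observed in training
        self.n = len(training)         # number of training examples
        self.r = len(self.allowed)     # number of allowed values
        self.last_anomaly_time = last_anomaly_time  # assumption: supplied by caller

    def score(self, value, now):
        """Return None for an allowed value, else t * n / r."""
        if value in self.allowed:
            return None
        t = now - self.last_anomaly_time
        self.last_anomaly_time = now   # the clock resets at every anomaly
        return t * self.n / self.r

model = TimeBasedModel("0000111000", last_anomaly_time=7)
scores = [model.score(v, now) for now, v in enumerate("01223", start=11)]
```

Test values are not added to the allowed set here, which is why the second 2 still receives a score, only a smaller one because t has dropped to 1.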
PHAD – Fixed Rules
• 34 packet header fields
– Ethernet (address, protocol)
– IP (TOS, TTL, fragmentation, addresses)
– TCP (options, flags, port numbers)
– UDP (port numbers, checksum)
– ICMP (type, code, checksum)
• Global model
LERAD – Learns Conditional Rules
• Models inbound client TCP (addresses, ports, flags, 8 words in payload)
• Learns conditional rules
If port = 80 then word1 = GET, POST (n/r = 10000/2)
LERAD Rule Learning
• If word1 = GET then port = 80 (n/r = 2/1)
• word1 = GET, HELO (n/r = 3/2)
• If address = Marx then port = 80, 25 (n/r = 2/2)

Address  Port  Word1  Word2
Hume     80    GET    /
Marx     80    GET    /index.html
Marx     25    HELO   Pascal
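The n/r values for these three rules can be checked against the sample table with a small sketch. The function name and the dictionary representation of a rule are illustrative choices, not LERAD's actual data structures.

```python
def rule_stats(rows, condition, target):
    """For the rule 'if condition then target is in allowed', return (n, allowed).

    n is the number of rows satisfying the condition; allowed is the set
    of target values seen in those rows, so r = len(allowed).
    """
    matching = [row for row in rows
                if all(row[key] == value for key, value in condition.items())]
    allowed = {row[target] for row in matching}
    return len(matching), allowed

rows = [
    {"address": "Hume", "port": 80, "word1": "GET", "word2": "/"},
    {"address": "Marx", "port": 80, "word1": "GET", "word2": "/index.html"},
    {"address": "Marx", "port": 25, "word1": "HELO", "word2": "Pascal"},
]

n1, allowed1 = rule_stats(rows, {"word1": "GET"}, "port")      # n/r = 2/1
n2, allowed2 = rule_stats(rows, {}, "word1")                   # n/r = 3/2
n3, allowed3 = rule_stats(rows, {"address": "Marx"}, "port")   # n/r = 2/2
```

An empty condition matches every row, which is how the unconditional rule on word1 is expressed.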
LERAD Rule Learning
• Randomly pick rules based on matching attributes
• Select nonoverlapping rules with high n/r on a sample
• Train on full training set (new n/r)
• Discard rules that discover novel values in last 10% of training (known false alarms)
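The final step might be sketched as follows. This is one reading of the slide rather than LERAD's actual code: a rule is dropped if, among the rows matching its condition, the last 10% of training contains a target value absent from the first 90%. The helper names, the dictionary rule format, and rounding the held-back portion up to at least one row are all assumptions.

```python
def matches(row, condition):
    """True if the row satisfies every attribute test in the condition."""
    return all(row[key] == value for key, value in condition.items())

def keep_rule(rows, condition, target):
    """Return False if the rule sees a novel value in the last 10% of training."""
    tail = max(1, len(rows) // 10)  # assumption: hold back at least one row
    head, last = rows[:-tail], rows[-tail:]
    allowed = {row[target] for row in head if matches(row, condition)}
    return all(row[target] in allowed
               for row in last if matches(row, condition))

# Hypothetical training set: nine GET requests, then one POST, all on port 80.
rows = [{"port": 80, "word1": "GET"}] * 9 + [{"port": 80, "word1": "POST"}]

keep_word_rule = keep_rule(rows, {"port": 80}, "word1")  # False: POST is novel
keep_port_rule = keep_rule(rows, {}, "port")             # True: port 80 already seen
```

A rule that is still learning new values at the end of training would likely keep doing so on normal test traffic, which is why such rules are treated as known sources of false alarms.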
DARPA/Lincoln Labs Evaluation
• 1 week of attack-free training data
• 2 weeks with 201 attacks
[Figure: test network diagram. Hosts: SunOS, Solaris, Linux, NT. Other labels: Router, Internet, Sniffer, Attacks]
Attacks out of 201 Detected at 10 False Alarms per Day
[Bar chart: number of attacks detected (axis 0 to 140) for PHAD, ALAD, LERAD, and NETAD]
Problems with Synthetic Traffic
• Attributes are too predictable: TTL, TOS, TCP options, TCP window size, HTTP, SMTP command formatting
• Too few sources: Client addresses, HTTP user agents, ssh versions
• Too “clean”: no checksum errors, fragmentation, garbage data in reserved fields, malformed commands
Real Traffic is Less Predictable
[Plot: r (number of values) versus time, with curves for synthetic and real traffic]
Mixed Traffic: Fewer Detections, but More are Legitimate
[Bar chart: total and legitimate detections (axis 0 to 140) for PHAD, ALAD, LERAD, and NETAD]
Project Status
• Philip K. Chan – Project Leader
• Gaurav Tandon – Applying LERAD to system call arguments
• Rachna Vargiya – Application payload tokenization
• Mohammad Arshad – Network traffic outlier analysis by clustering
Further Reading
• Learning Nonstationary Models of Normal Network Traffic for Detecting Novel Attacks by Matthew V. Mahoney and Philip K. Chan, Proc. KDD.
• Network Traffic Anomaly Detection Based on Packet Bytes by Matthew V. Mahoney, Proc. ACM-SAC.
• http://cs.fit.edu/~mmahoney/dist/