Northwestern Lab for Internet and Security Technology (LIST)
![Page 1: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/1.jpg)
Northwestern Lab for Internet and Security Technology (LIST)
Yan Chen
• Router-based Anomaly/Intrusion Detection and Mitigation (RAIDM) Systems
• Scalable and Accurate Overlay Network Monitoring and Diagnosis
• Wireless and Ad hoc Networking
![Page 2: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/2.jpg)
Northwestern Lab for Internet and Security Technology (LIST)
Yan Chen
Department of Computer Science, Northwestern University
http://list.cs.northwestern.edu
![Page 3: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/3.jpg)
Our Theme
• The Internet is becoming a new infrastructure for service delivery
– World Wide Web
– VoIP
– Email
– Interactive TV?
• Major challenges for Internet-scale services
– Scalability: 600M users, 35M Web sites, 2.1 Tb/s
– Security: viruses, worms, Trojan horses, etc.
– Mobility: ubiquitous devices in phones, shoes, etc.
– Agility: dynamic systems/networks, congestion/failures
– Ossification: extremely hard to deploy new technology in the core
![Page 4: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/4.jpg)
Projects at LIST
• Global Router-based Anomaly/Intrusion Detection (GRAID) Systems
• Distributed Information Retrieval Systems
![Page 5: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/5.jpg)
Battling Hackers is a Growth Industry!
• The past decade has seen an explosion in the concern for the security of information
• Internet attacks are increasing in frequency, severity and sophistication
• Denial of service (DoS) attacks
– Cost $1.2 billion in 2000
– Thousands of attacks per week in 2001
– Yahoo, Amazon, eBay, Microsoft, White House, etc., attacked
-- Wall Street Journal (11/10/2004)
![Page 6: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/6.jpg)
Battling Hackers is a Growth Industry (cont’d)
• Viruses and worms are getting faster and more powerful
– Melissa, Nimda, Code Red, Code Red II, Slammer …
– Caused over $28 billion in economic losses in 2003, growing to over $75 billion by 2007
– Code Red (2001): infected >360K machines in 13 hours, $2.4 billion loss
– Slammer (2003): infected >75K machines in 10 minutes, $1 billion loss
• Spyware is ubiquitous
– 80% of Internet computers have spyware installed
![Page 7: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/7.jpg)
The Spread of Sapphire/Slammer Worms
![Page 8: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/8.jpg)
Current Intrusion Detection Systems (IDS)
• Mostly host-based and not scalable to high-speed networks
– Slammer worm infected 75,000 machines in <10 mins
– Host-based schemes are inefficient and user dependent
» Have to install IDS on all user machines!
• Mostly signature-based
– Cannot recognize unknown anomalies/intrusions
– New viruses/worms, polymorphism
• Statistical detection
– Hard to adapt to traffic pattern changes
– Unscalable for flow-level detection
» IDS vulnerable to DoS attacks
– Overall traffic based: inaccurate, high false positives
![Page 9: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/9.jpg)
Current Intrusion Detection Systems (II)
• Cannot differentiate malicious events from unintentional anomalies
– Anomalies can be caused by network element faults
– E.g., router misconfiguration, signal interference of wireless network, etc.
• Isolated or centralized systems
– Insufficient info for causes, patterns and prevalence of global-scale attacks
![Page 10: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/10.jpg)
Global Router-based Anomaly/Intrusion Detection (GRAID) Systems
• Online traffic recording and analysis for high-speed networks
– Leverage sketches for data streaming computation
• Online adaptive flow-level anomaly/intrusion detection and mitigation
– Leverage statistical learning theory (SLT) to adaptively learn the traffic pattern changes
– E.g., busy vs. idle wireless networks, with different levels of interference, etc.
– Unsupervised learning without knowing ground truth
![Page 11: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/11.jpg)
GRAID Systems (II)
• Integrated approach for false positive reduction
– Signature-based detection
– Network element fault diagnostics
– Traffic signature matching of emerging applications
• Hardware speedup for real-time detection
– Collaboration with Gokhan Memik (ECE of NU)
– Try various hardware platforms: FPGAs, network processors
• Scalable anomaly/intrusion alarm fusion with distributed hash tables (DHT)
– Automatically distribute alerts with similar symptoms to the same fusion center for analysis
![Page 12: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/12.jpg)
GRAID Detection Sensor
• Attached to a router or access point as a black box
• Edge network detection is particularly powerful
[Figure: sensor placement in three panels (a), (b), (c). Captions: original configuration; monitor each port separately; monitor aggregated traffic from all ports. Diagram labels: Internet, Router, Switch, LAN, Splitter, scan port, GRAID sensor.]
![Page 13: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/13.jpg)
GRAID Sensor Architecture
[Figure: GRAID sensor architecture block diagram.
Parts: Part I, sketch-based monitoring & detection; Part II, per-flow monitoring & detection.
Modules: reversible k-ary sketch monitoring; filtering; sketch-based statistical anomaly detection (SSAD); per-flow monitoring; signature-based detection; traffic profile checking; statistical detection; network fault detection.
Flows and records: streaming packet data; local sketch records, sent out for aggregation; remote aggregated sketch records; keys of normal flows; keys of suspicious flows; normal flows; suspicious flows; intrusion or anomaly alarms to fusion centers.
Legend: data path; control path; modules on the critical path; modules on the non-critical path.]
![Page 14: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/14.jpg)
Scalable Traffic Monitoring and Analysis - Challenge
• Potentially tens of millions of time series!
– Need to work at very low aggregation level (e.g., IP level)
» Changes may be buried inside aggregated traffic
– The Moore's Law on traffic growth …
• Per-flow analysis is too slow or too expensive
– Want to work in near real time
• Existing approaches not directly applicable
– Mostly focus on heavy-hitters
![Page 15: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/15.jpg)
Sketch-based Change Detection
(ACM SIGCOMM IMC 2003, 2004)
• Input stream: (key, update)
• Summarize input stream using sketches
• Build forecast models on top of sketches
• Report flows with large forecast errors
[Figure: pipeline (k,u) … → Sketch module → Sketches → Forecast module(s) → Error Sketch → Change detection module → Alarms]
![Page 16: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/16.jpg)
Sketch
• Probabilistic summary of data streams
– Originated in STOC 1996 [AMS96]
– Widely used in database research to handle massive data streams

| | Space | Accuracy |
|---|---|---|
| Hash table | Per-key state | 100% |
| Sketch | Compact | With probabilistic guarantees (better for larger values) |
![Page 17: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/17.jpg)
K-ary Sketch
• Array of hash tables: T_j[K] (j = 1, …, H)
[Figure: H rows (1, …, j, …, H), each with K buckets (0, 1, …, K-1); key k maps to bucket h_j(k) in row j, from h_1(k) to h_H(k)]
• Update (k, u): T_j[h_j(k)] += u (for all j)
![Page 18: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/18.jpg)
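K-ary Sketch (cont'd)
• Estimate v(S, k): sum of updates for key k

$$v^{est}(S, k) = \operatorname{median}_{j} \left\{ \frac{T_j[h_j(k)] - \mathrm{sum}(S)/K}{1 - 1/K} \right\}$$

where sum(S) is the sum of all updates recorded in the sketch.
– T_j[h_j(k)]: v(S, k) + noise
– sum(S)/K: v(S, k)/K + E(noise)
– Dividing by 1 − 1/K: compensate for signal loss
– Median over j: boost confidence
• Result: unbiased estimator of v(S, k) with low variance
[Figure: the same H × K sketch layout, with key k mapped to bucket h_j(k) in each row j]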
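The update rule and the estimator on these two slides can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the class name `KArySketch`, the default sizes, and the use of salted BLAKE2b as the per-row hash h_j are all assumptions made here.

```python
import hashlib
from statistics import median


class KArySketch:
    """H hash tables of K buckets each, as on the slides (sizes are illustrative)."""

    def __init__(self, H=5, K=1024):
        self.H, self.K = H, K
        self.table = [[0.0] * K for _ in range(H)]
        self.total = 0.0  # sum(S): sum of all updates seen so far

    def _bucket(self, j, key):
        # Assumed hash family: BLAKE2b salted with the row index j.
        digest = hashlib.blake2b(
            repr(key).encode(), digest_size=8, salt=j.to_bytes(8, "big")
        ).digest()
        return int.from_bytes(digest, "big") % self.K

    def update(self, key, u):
        """Update (k, u): T_j[h_j(k)] += u for every row j."""
        for j in range(self.H):
            self.table[j][self._bucket(j, key)] += u
        self.total += u

    def estimate(self, key):
        """Median over rows of (T_j[h_j(k)] - sum(S)/K) / (1 - 1/K)."""
        K = self.K
        per_row = [
            (self.table[j][self._bucket(j, key)] - self.total / K) / (1 - 1 / K)
            for j in range(self.H)
        ]
        return median(per_row)
```

With a single key in the stream the estimate is exact, because every row holds the full sum and the sum(S)/K correction cancels against the 1 − 1/K scaling. With many keys, collisions add noise and the median across rows limits its effect.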
![Page 19: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/19.jpg)
Forecast Model: EWMA
• Sketches are linear (can combine sketches)
• Compute forecast error sketch: S_error

$$S_{error}(t-1) = S_{observed}(t-1) - S_{forecast}(t-1)$$

• Update forecast sketch: S_forecast

$$S_{forecast}(t) = \alpha \, S_{observed}(t-1) + (1 - \alpha) \, S_{forecast}(t-1)$$

where α (0 ≤ α ≤ 1) is the EWMA smoothing constant.
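Because sketches are linear, the EWMA forecast and the error sketch can be computed cell by cell on the tables themselves. The following is a minimal sketch under stated assumptions: a sketch is represented as a plain list of rows, and the helper names `combine` and `ewma_step` and the smoothing constant `alpha` are introduced here for illustration.

```python
def combine(a, sketch_a, b, sketch_b):
    """Return the linear combination a*sketch_a + b*sketch_b, cell by cell."""
    return [
        [a * x + b * y for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(sketch_a, sketch_b)
    ]


def ewma_step(observed_prev, forecast_prev, alpha):
    """One EWMA step on sketch tables.

    Returns (error sketch for interval t-1, forecast sketch for interval t).
    """
    # S_error(t-1) = S_observed(t-1) - S_forecast(t-1)
    error_prev = combine(1.0, observed_prev, -1.0, forecast_prev)
    # S_forecast(t) = alpha * S_observed(t-1) + (1 - alpha) * S_forecast(t-1)
    forecast_now = combine(alpha, observed_prev, 1.0 - alpha, forecast_prev)
    return error_prev, forecast_now
```

For example, with one row of two buckets, `ewma_step([[10.0, 0.0]], [[4.0, 2.0]], 0.5)` gives the error sketch `[[6.0, -2.0]]` and the next forecast `[[7.0, 1.0]]`.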
![Page 20: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/20.jpg)
Evaluation of Reversible K-ary Sketch
• Evaluated with tier-1 ISP trace and NU traces
• Scalable
– Can handle tens of millions of time series
• Accurate
– Provable probabilistic accuracy guarantees
– Even more accurate on real Internet traces
• Efficient
– For the worst case traffic, all 40 byte packets:
» 16 Gbps on a single FPGA board
» 526 Mbps on a Pentium-IV 2.4GHz PC
– Less than 3 MB memory used
• Patent filed
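To put the worst-case throughput figures in per-packet terms, the conversion below divides the bit rate by the packet size. It assumes the 40 bytes cover the whole packet, with no framing or inter-packet overhead counted; the function name is illustrative.

```python
def packets_per_second(bits_per_second, packet_bytes=40):
    """Packets per second at a given bit rate, ignoring link-layer overhead."""
    return bits_per_second / (packet_bytes * 8)


fpga_pps = packets_per_second(16e9)   # 16 Gbps on the FPGA board
pc_pps = packets_per_second(526e6)    # 526 Mbps on the Pentium-IV PC
print(f"FPGA: {fpga_pps:,.0f} packets/s, PC: {pc_pps:,.0f} packets/s")
```

Under that assumption the FPGA figure corresponds to 50 million packets per second and the PC figure to about 1.64 million packets per second.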
![Page 21: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/21.jpg)
Remaining Challenges
• Reversible sketch to infer the culprit flows (ACM SIGCOMM IMC 2004)
• Hierarchical and multi-dimensional sketch
• Detecting distributed and insidious attacks with sketch
![Page 23: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/23.jpg)
Statistical Anomaly Detection
• Online statistical detection with sketches
• Applying Statistical Learning Theory (SLT)
– Use Hidden Markov Model (HMM) to adaptively learn the parameters
• Focus on two major intrusions: denial of service (DoS) attacks and port scanning
• Monitor traffic with multiple sketches
– With different keys
» (Source IP, Dest IP)
» (Source IP, Dest port)
» (Dest IP, Dest port)
– For each key, record the number of unconnected TCP requests: number of SYNs minus number of SYN/ACKs
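The per-key bookkeeping described above can be illustrated as follows. This is a simplified sketch, not the GRAID code: it uses exact `Counter` objects where the slides use sketches, and the packet tuple layout, the flag strings, and the function names are assumptions. A SYN/ACK travels in the reverse direction, so its addresses are swapped before it is charged to the key of the original request.

```python
from collections import Counter

KEY_TYPES = ("sip_dip", "sip_dport", "dip_dport")


def request_keys(client_ip, server_ip, server_port):
    """The three keys from the slide, expressed for the request direction."""
    return {
        "sip_dip": (client_ip, server_ip),
        "sip_dport": (client_ip, server_port),
        "dip_dport": (server_ip, server_port),
    }


def unconnected_requests(packets):
    """Count SYN minus SYN/ACK per key.

    packets: iterable of (src_ip, src_port, dst_ip, dst_port, flags) tuples,
    where flags is "SYN", "SYN/ACK", or anything else (ignored).
    """
    counts = {name: Counter() for name in KEY_TYPES}
    for src_ip, src_port, dst_ip, dst_port, flags in packets:
        if flags == "SYN":
            keys, delta = request_keys(src_ip, dst_ip, dst_port), 1
        elif flags == "SYN/ACK":
            # Reply from the server: swap endpoints to match the request key.
            keys, delta = request_keys(dst_ip, src_ip, src_port), -1
        else:
            continue
        for name, key in keys.items():
            counts[name][key] += delta
    return counts
```

A source that probes several ports on one host and gets few replies accumulates a large count under its (Source IP, Dest IP) key, while a SYN flood victim accumulates it under (Dest IP, Dest port).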
![Page 24: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/24.jpg)
Intrusion Mitigation
| Attacks detected | Mitigation |
|---|---|
| Denial of Service (DoS), e.g., TCP SYN flooding | SYN defender, SYN proxy, or SYN cookie for victim |
| Port scan and worms | Ingress filtering with attacker IP |
| Vertical port scan | Quarantine the victim machine |
| Horizontal port scan | Monitor traffic with the same port # for compromised machine |
| Spyware | Warn the end users being spied on |

[Figure: scan types HORIZONTAL, VERTICAL and BLOCK; labels PORT NUMBER and SOURCE IP]
![Page 26: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/26.jpg)
Network Diagnosis and Fault Location
• Infrastructure ossification led to a thrust of overlay applications
• Traceroute gives hop-by-hop round-trip latency
– Asymmetric routing
– Can't get hop-by-hop loss rate!
• Network tomography
– Infer the properties of links from end-to-end measurements
– Limited measurements -> under-constrained system, unidentifiable links
– Existing work uses various constraints and assumptions
» Tree-like topology
» The number of lossy links is small
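The "under-constrained system" can be written out explicitly. The formulation below is the usual loss-tomography setup, given as an illustration rather than taken from the slides; the symbols p_i, l_j, G, x and b are notation assumed here, and link losses are assumed independent.

```latex
% p_i : measured loss rate of end-to-end path i
% l_j : unknown loss rate of link j
% G_{ij} = 1 if path i traverses link j, and 0 otherwise
\[
  1 - p_i \;=\; \prod_{j} (1 - l_j)^{G_{ij}}
  \quad\Longrightarrow\quad
  \underbrace{\log(1 - p_i)}_{b_i} \;=\; \sum_{j} G_{ij}\,\underbrace{\log(1 - l_j)}_{x_j},
  \qquad\text{i.e. } G x = b .
\]
% With fewer independent path measurements than links, rank(G) is smaller than
% the number of links, so x is not uniquely determined.
% Example: if links 1 and 2 appear only together on every measured path,
% then only the sum x_1 + x_2 is determined, never x_1 or x_2 separately.
```

In the example, the two-link segment is the finest granularity at which a loss rate can be pinned down from these measurements.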
![Page 27: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/27.jpg)
Our Approach: Virtual Links
• Minimal link sequences (path segments) whose loss rates can be uniquely identified
– Locate the faults to certain link(s)
• The first lower bound on the network tomography granularity
• Use an algebraic scheme to find virtual links
– Leverage our work on overlay network monitoring (ACM SIGCOMM IMC 2003, ACM SIGCOMM 2004)
![Page 29: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/29.jpg)
Intrusion/anomaly Alarm Fusion
• An individual IDS has poor accuracy due to its limited view
• Crucial to collect information from multiple vantage points: distributed IDS (DIDS)
– Each IDS generates a local symptom report and sends it to a sensor fusion center (SFC)
• Helps understand the prevalence, cause and patterns of global-scale attacks
• Existing DIDS
– Centralized fusion
– Distributed fusion with unscalable communication
![Page 30: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/30.jpg)
GRAID Sensor Interconnection
• Through the Cyber Disease DHT (CDDHT), a distributed hash table for alarm fusion
– Scalability
– Load balancing
– Fault-tolerance
– Intrusion correlation
[Figure: IDS and IDS + SFC nodes across the Internet, connected by a CDDHT mesh; labels include GRAID coverage and two injected attacks]
![Page 31: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/31.jpg)
Basic Operations of CDDHT
• put (disease_key, symptom report)
– Send report to SFC
• attack_info = get (disease_key)
– Query about certain attacks from SFC
• Each operation takes only O(log n) hops
– n is the total number of nodes in CDDHT
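The put/get interface can be illustrated with a toy, single-process stand-in. This is a sketch under loud assumptions: the class `ToyCDDHT`, the consistent-hashing ring, and the SHA-1 based placement are choices made here, and the code does no networking or multi-hop routing. It only shows how reports carrying the same disease key end up at the same sensor fusion center (SFC).

```python
import bisect
import hashlib
from collections import defaultdict


def ring_position(value: str) -> int:
    """Map a string onto a 32-bit identifier ring (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(value.encode()).digest()[:4], "big")


class ToyCDDHT:
    """Single-process stand-in for a DHT: no networking, no hop-by-hop routing."""

    def __init__(self, sfc_names):
        # Each SFC is responsible for the ring arc that ends at its position.
        self._ring = sorted((ring_position(name), name) for name in sfc_names)
        self._positions = [pos for pos, _ in self._ring]
        self._store = {name: defaultdict(list) for name in sfc_names}

    def responsible_sfc(self, disease_key) -> str:
        """First SFC clockwise from the key's position on the ring."""
        idx = bisect.bisect_left(self._positions, ring_position(repr(disease_key)))
        return self._ring[idx % len(self._ring)][1]

    def put(self, disease_key, symptom_report) -> str:
        """Store a report at the responsible SFC and return that SFC's name."""
        sfc = self.responsible_sfc(disease_key)
        self._store[sfc][disease_key].append(symptom_report)
        return sfc

    def get(self, disease_key):
        """Return all reports filed under this disease key."""
        sfc = self.responsible_sfc(disease_key)
        return list(self._store[sfc][disease_key])
```

Two sensors that call `put` with the same key reach the same SFC, which can then correlate their reports; a later `get` with that key returns both.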
![Page 32: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/32.jpg)
CDDHT: Disease Key Design
| Intrusion | ID | Characterization field(s) |
|---|---|---|
| DoS attack | 0 | Victim IP (subnet) |
| Scans | 1 | 0 (for vertical & block scan); Source IP address; Destination IP (for vertical scan) or 0 (for block scan) |
| Scans | 1 | 1 (for horizontal & coordinated scan); Scan port number; Source IP (for horizontal scan) or 0 (for coordinated scan) |
| Viruses/Worms | 2 | 0 (for known virus/worm); Worm ID |
| Viruses/Worms | 2 | 1 (for unknown virus/worm); Destination port number |
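One way to read the key layout is as a tuple whose first element is the intrusion ID and whose remaining elements are the characterization fields. The encoding below is an illustrative assumption: the slides do not specify field widths or a wire format, and the function names are made up for this sketch.

```python
DOS, SCAN, WORM = 0, 1, 2  # intrusion IDs from the table


def dos_key(victim_subnet):
    """DoS attack: ID 0, then the victim IP (subnet)."""
    return (DOS, victim_subnet)


def scan_key(kind, source_ip=None, dest_ip=None, port=None):
    """Scans: ID 1, a sub-type flag, then the fields listed in the table."""
    if kind == "vertical":
        return (SCAN, 0, source_ip, dest_ip)
    if kind == "block":
        return (SCAN, 0, source_ip, 0)
    if kind == "horizontal":
        return (SCAN, 1, port, source_ip)
    if kind == "coordinated":
        return (SCAN, 1, port, 0)
    raise ValueError(f"unknown scan kind: {kind}")


def worm_key(worm_id=None, dest_port=None):
    """Viruses/worms: ID 2, flag 0 plus worm ID if known, else flag 1 plus port."""
    if worm_id is not None:
        return (WORM, 0, worm_id)
    return (WORM, 1, dest_port)
```

Sensors that observe the same event build the same tuple, for example `scan_key("horizontal", source_ip="1.2.3.4", port=445)` on every sensor that sees that scanner, so their reports share one disease key.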
![Page 33: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/33.jpg)
Other Challenges of CDDHT
• Load balancing
• Supporting complicated queries
– E.g., aggregate queries
• Attack resilience
– OK to have some IDS sensors compromised
– What about SFCs?
![Page 34: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/34.jpg)
Research methodology
Combination of theory, synthetic/real trace driven simulation, and real-world implementation and deployment
![Page 35: Northwestern Lab for Internet and Security Technology (LIST)](https://reader036.fdocuments.us/reader036/viewer/2022062723/56813d6e550346895da74d36/html5/thumbnails/35.jpg)
Conclusion for GRAID Systems
• Online traffic recording and analysis on high-speed networks
• Online statistical anomaly detection
• Integrated approach for false positive reduction
– Signature-based detection
– Network element fault diagnostics
– Traffic signature matching of emerging applications
• Hardware speedup for real-time detection
• Scalable anomaly/intrusion alarm fusion with distributed hash tables (DHT)