Using Visual Motifs to Classify Encrypted Traffic VizSEC'06 - November 3, 2006 Charles V Wright Fabian Monrose Gerald M Masson Johns Hopkins University Information Security Institute

Page 1

Using Visual Motifs to Classify Encrypted Traffic

VizSEC'06 - November 3, 2006

Charles V Wright, Fabian Monrose, Gerald M Masson

Johns Hopkins University Information Security Institute

Page 2

Traffic Classification: Why?

● To detect intrusions or malware
– Is your mail server hosting a phishing website? (Are you sure?)

● To detect misuse by legitimate users
– File sharing
– Chat, Instant Messaging

Page 3

Traffic Classification: Why?

● Port numbers are not reliable
– They can be changed at will by the end hosts

● Increased use of cryptography precludes inspection of packet payloads
– Good: Hackers can't get our passwords.
– Bad: Network admins have less info to work with.

Page 4

Traffic Classification: How?

● Manually?
– tcpdump output? Ethereal/Wireshark?

Page 5

Traffic Classification: How?

● Manually? No.
– tcpdump output? Ethereal/Wireshark?

● Machine Learning
– Text classification [ZP00] [MP05] [Dre06] [Ma06]
– Decision Trees [EBR03]
– Naïve Bayes [MZ05]
– Hidden Markov Models [WMM04] [WMM]

Page 6

Traffic Classification: How?

● Manually? No.
– tcpdump output? Ethereal/Wireshark?

● Machine Learning
– [ZP00] [EBR03] [WMM04] [MP05] [MZ05] [Dre06] [Ma06] [WMM]

● Visually
– Look for distinctive visual motifs in the patterns produced by packets on the wire

Page 7

Core observation of this work:

Application protocols behave differently and thus look different from each other on the wire.

Page 8

Core observation of this work:

Application protocols behave differently and thus look different from each other on the wire.

Even when encrypted using SSL or TLS.

Page 9

Application to Traffic Classification

● We can use these differences to distinguish between common application protocols in the traffic that we see on our networks
– Quickly and easily
– Without port numbers
– Without packet payloads
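
As a rough illustration of what remains observable without port numbers or payloads, the sketch below keeps only each packet's arrival time, size, and direction. The `Packet` type, the `to_signed_series` helper, and the convention that client packets are positive and server packets negative are assumptions made for this example, not something the slides specify.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Packet:
    """Payload-free view of one packet (hypothetical type for illustration)."""
    timestamp: float   # capture time in seconds
    size: int          # packet size in bytes
    from_client: bool  # True if sent by the client, False if by the server


def to_signed_series(packets: List[Packet]) -> List[Tuple[float, int]]:
    """Return (seconds since first packet, signed size) pairs in time order.

    Assumed convention: client packets get a positive size,
    server packets a negative size.
    """
    if not packets:
        return []
    ordered = sorted(packets, key=lambda p: p.timestamp)
    start = ordered[0].timestamp
    return [
        (p.timestamp - start, p.size if p.from_client else -p.size)
        for p in ordered
    ]
```

Nothing in this representation depends on the port number or on decrypting the payload, which is the point of the slide above.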

Page 10

What does a TCP connection look like?

[Figure: Example: HTTP. Axis labels: from client, from server.]

Page 11

What does a TCP connection look like?

[Figure: Example: HTTP. Axis labels: from client, from server. Annotations: TCP 3-way Handshake; HTTP Request; Data Transfer from Server to Client.]

Page 12

What does a TCP connection look like?

[Figure: Example: SMTP. Axis labels: from client, from server. Annotations: TCP 3-way Handshake; Data Transfer from Client to Server; SMTP Handshaking (EHLO, RCPT TO, etc.); SMTP GOODBYE; TCP FIN.]

Page 13

Viewing many similar TCP connections at once

[Figure: Example: HTTP, n = 1. Axis labels: from client, from server.]

Page 14

Viewing many similar TCP connections at once

[Figure: Example: HTTP, n = 2. Axis labels: from client, from server.]

Page 15

Viewing many similar TCP connections at once

[Figure: Example: HTTP, n = 3. Axis labels: from client, from server.]

Page 16

Viewing many similar TCP connections at once

[Figure: Example: HTTP, n = 50. Axis labels: from client, from server.]

Yuck!

Page 17

Viewing many similar TCP connections at once - heat maps

[Figure: Example: HTTP. Axis labels: from client, from server.]

– Dark spots: very few packets
– Bright spots: lots of packets

Page 18

Viewing many similar TCP connections at once - heat maps

[Figure: Example: HTTP. Axis labels: from client, from server. Annotations: TCP handshake; HTTP requests; HTTP response; Data from server; ACKs from client.]
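
A minimal sketch of how such a heat map could be accumulated. It assumes each connection is a list of (seconds since connection start, signed packet size) pairs, with client packets positive and server packets negative. The function name `heat_map`, its binning parameters, and the sign convention are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def heat_map(
    connections: Sequence[Sequence[Tuple[float, int]]],
    time_bins: int,
    size_bins: int,
    max_time: float,
    max_size: int,
) -> List[List[int]]:
    """Count packets from many connections on one shared grid.

    Columns cover time in [0, max_time); rows cover signed packet size
    in (-max_size, max_size). Packets outside that window are skipped.
    """
    grid = [[0] * time_bins for _ in range(size_bins)]
    for connection in connections:
        for t, signed_size in connection:
            if not (0 <= t < max_time) or abs(signed_size) >= max_size:
                continue
            # Map time and signed size onto bin indices, clamped to the grid.
            col = min(int(t / max_time * time_bins), time_bins - 1)
            row = min(
                int((signed_size + max_size) / (2 * max_size) * size_bins),
                size_bins - 1,
            )
            grid[row][col] += 1
    return grid
```

A grid like this could then be rendered with matplotlib's `imshow`, so that cells hit by many packets appear bright and cells hit by few packets appear dark.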

Page 19

Classifying traffic with heat maps and visual motifs

[Figure: panels labeled HTTP, AIM, SMTP, HTTP.]

Page 20

Classifying traffic with heat maps and visual motifs

[Figure: panels labeled HTTP, AIM, SMTP, SSH.]

Page 21

Does this look like HTTP?

Page 22

Or more like SMTP?

Page 23

Limitations

● The previous graphs illustrate time-dependent properties of the application protocols

● They also cover a very short time span

● Long-lived, free-form protocols like SSH may be better characterized by taking a different view of the data

Page 24

Steady-State Properties

● We assume these don't change over the life of the connection

● Look at individual packets (unigrams)
– How big is the packet?
– How long since the previous packet?

● Look at pairs of consecutive packets (bigrams)
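
The unigram and bigram views can be sketched as simple counting. The example assumes a connection is a time-ordered list of (timestamp in seconds, signed packet size) pairs, with client packets positive and server packets negative. The function names, bin widths, and the choice to pair each packet's size with its gap from the previous packet are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple

Series = List[Tuple[float, int]]  # (timestamp in seconds, signed size)


def unigram_counts(series: Series, size_step: int = 100,
                   gap_step: float = 0.01) -> Counter:
    """Count (size bin, inter-arrival bin) for each individual packet."""
    counts: Counter = Counter()
    previous_time = None
    for timestamp, signed_size in series:
        # The first packet has no predecessor, so its gap is taken as 0.
        gap = 0.0 if previous_time is None else timestamp - previous_time
        counts[(signed_size // size_step, int(gap / gap_step))] += 1
        previous_time = timestamp
    return counts


def bigram_counts(series: Series, size_step: int = 100) -> Counter:
    """Count (size bin of packet i, size bin of packet i+1) pairs."""
    bins = [signed_size // size_step for _, signed_size in series]
    return Counter(zip(bins, bins[1:]))


def to_frequencies(counts: Counter) -> Dict[Hashable, float]:
    """Normalize counts so that the values sum to 1."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()} if total else {}
```

Because the counts ignore where in the connection a packet occurs, they match the steady-state assumption on this slide: the same function applies to a short HTTP fetch and to a long-lived SSH session.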

Page 25

Unigram Frequencies: HTTP

[Figure: axis labels: from client, from server.]

Page 26

Unigram Frequencies

[Figure: panels labeled HTTP, AIM, SMTP, SSH.]

Page 27

Bigram Frequencies

[Figure: panels labeled HTTP, AIM, SMTP, SSH.]

Page 28

Bigram Frequencies: HTTP

[Figure: both axes labeled: from client, from server.]

Page 29

Bigram Frequencies: SMTP

[Figure: both axes labeled: from client, from server.]

Page 30

Bigram Frequencies: AIM

[Figure: both axes labeled: from client, from server.]

Page 31

Bigram Frequencies: SSH

[Figure: both axes labeled: from client, from server.]

Page 32

Bigrams in 3D

Page 33

Future Work

● Work is in progress to build an interactive GUI application for analyzing packet traces
– Open Source release planned for later this academic year

● We're also exploring ways to integrate Machine Learning with Visualization more effectively

Page 34

Acknowledgments

● Many thanks to the developers of Numerical Python and the Python matplotlib package

● Thanks also to the Statistics Group at GMU and to Pang et al. at LBNL for providing access to their packet traces

Page 35

Thanks!

● Questions?

Page 36

References

● [Dre06] H. Dreger, A. Feldmann, M. Mai, V. Paxson, and R. Sommer. Dynamic Application-Layer Protocol Analysis for Network Intrusion Detection. USENIX Security 2006.

● [EBR03] J. Early, C. Brodley, and C. Rosenberg. Behavioral Authentication of Server Flows. ACSAC 2003.

● [Ma06] J. Ma, K. Levchenko, C. Kreibich, S. Savage, and G.M. Voelker. Unexpected Means of Protocol Inference. IMC 2006.

● [MP05] A. Moore and D. Papagiannaki. Toward the Accurate Identification of Network Applications. PAM 2005.

● [MZ05] A. Moore and D. Zuev. Internet Traffic Classification Using Bayesian Analysis Techniques. ACM SIGMETRICS, June 2005.

● [WMM04] C. Wright, F. Monrose, and G.M. Masson. HMM Profiles for Network Traffic Classification (Extended Abstract). VizSEC/DMSEC 2004.

● [WMM] C.V. Wright, F. Monrose, and G.M. Masson. On Inferring Application Protocol Behaviors in Encrypted Network Traffic. JMLR Special Topic on Computer Security. (to appear)

● [ZP00] Y. Zhang and V. Paxson. Detecting Back Doors. USENIX Security 2000.