The Devil and Packet Trace Anonymization Authors: Ruoming Pang, Mark Allman, Vern Paxson and Jason...

The Devil and Packet Trace Anonymization Authors: Ruoming Pang, Mark Allman, Vern Paxson and Jason Lee Published: ACM SIGCOMM Computer Communication Review, Volume 36, Issue 1, January 2006 Presenter: Ping Wang


The Devil and Packet Trace Anonymization

Authors: Ruoming Pang, Mark Allman, Vern Paxson and Jason Lee

Published: ACM SIGCOMM Computer Communication Review, Volume 36, Issue 1, January 2006

Presenter: Ping Wang


Overview

Problem: How to anonymize packet traces before they are released

Goal: Preserve as much information as possible


Background

Why share? Verify previous results; compare competing ideas on the same data; provide a broader view

Who shares? NLANR's PMA packet traces; CAIDA's skitter measurements; LBNL's internal traffic


Background cont.

Available anonymization tools: tcpdpriv, ipsumdump, tcpurify

They are not general enough, and most of them focus only on header fields, primarily IP addresses


A new tool: tcpmkpub

Provides a general framework for anonymizing traces

It is based on explicit rules for each header field


An example specification

All fields must be specified with a name, a length, and an action ("KEEP", "ZERO", or a function)
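The specification shown on the original slide did not survive in this transcript. As a rough illustration of the idea only, the sketch below applies a per-field list of (name, length, action) rules to a header. It is written in Python, not in tcpmkpub's own specification syntax, and the names `apply_spec` and `UDP_SPEC` as well as the actions chosen for each UDP field are hypothetical.

```python
from typing import Callable, List, Tuple, Union

# An action is "KEEP", "ZERO", or a function from bytes to bytes.
Action = Union[str, Callable[[bytes], bytes]]
FieldSpec = Tuple[str, int, Action]  # (name, length in bytes, action)


def apply_spec(header: bytes, spec: List[FieldSpec]) -> bytes:
    """Rewrite a header field by field according to the spec."""
    out = bytearray()
    offset = 0
    for name, length, action in spec:
        value = header[offset:offset + length]
        if len(value) != length:
            raise ValueError(f"header too short for field {name!r}")
        if action == "KEEP":
            out += value                 # copy the field unchanged
        elif action == "ZERO":
            out += bytes(length)         # overwrite the field with zero bytes
        elif callable(action):
            new_value = action(value)    # field-specific transformation
            if len(new_value) != length:
                raise ValueError(f"action changed the size of {name!r}")
            out += new_value
        else:
            raise ValueError(f"unknown action for field {name!r}")
        offset += length
    return bytes(out)


# Hypothetical policy for the 8-byte UDP header (actions chosen for illustration).
UDP_SPEC: List[FieldSpec] = [
    ("src_port", 2, "KEEP"),
    ("dst_port", 2, "KEEP"),
    ("length", 2, "KEEP"),
    ("checksum", 2, "ZERO"),
]

anonymized = apply_spec(bytes.fromhex("1f900035001cabcd"), UDP_SPEC)
```

Because every field needs an explicit rule, a field that the policy forgot raises an error rather than passing through silently.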


An example specification cont.

Supports a case statement for header fields that can vary
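One way to picture the case statement is as a dispatch on a field value that decides which set of rules applies to the bytes that follow. The sketch below selects a rule list by IP protocol number; the table, the function name `select_spec`, and the actions assigned to each field are assumptions for illustration, not tcpmkpub syntax.

```python
from typing import Dict, List, Tuple

FieldSpec = Tuple[str, int, str]  # (name, length in bytes, action)

# Hypothetical rule lists for two 8-byte headers.
UDP_SPEC: List[FieldSpec] = [
    ("src_port", 2, "KEEP"),
    ("dst_port", 2, "KEEP"),
    ("length", 2, "KEEP"),
    ("checksum", 2, "ZERO"),
]
ICMP_SPEC: List[FieldSpec] = [
    ("type", 1, "KEEP"),
    ("code", 1, "KEEP"),
    ("checksum", 2, "ZERO"),
    ("rest_of_header", 4, "ZERO"),
]

# The "case" on the IP protocol field: 1 is ICMP, 17 is UDP.
SPEC_BY_PROTOCOL: Dict[int, List[FieldSpec]] = {1: ICMP_SPEC, 17: UDP_SPEC}


def select_spec(ip_protocol: int) -> List[FieldSpec]:
    """Pick the rule list for the next header; refuse unknown protocols."""
    try:
        return SPEC_BY_PROTOCOL[ip_protocol]
    except KeyError:
        raise ValueError(f"no rules for IP protocol {ip_protocol}") from None
```

Refusing unknown cases, rather than defaulting to "KEEP", matches the spirit of requiring an explicit rule for every field.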


Anonymization Policy

Checksums, link layer, network layer, transport layer


Checksums

Replace the original checksum C0 with Cc, the checksum recomputed over the anonymized packet

For checksums that cannot be verified: if the packet has been corrupted, insert "1"; if the original packet is truncated, use Cc (and note this in the meta-data)

Where the checksum is optional, as in UDP, use zero as the checksum
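The checksum rules above can be sketched as follows. `internet_checksum` is the standard 16-bit one's-complement Internet checksum; `choose_checksum` is a hypothetical helper that encodes the slide's rules as read here, including the assumption that the optional case means the sender left the checksum unset.

```python
from typing import Optional, Tuple


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words, complemented."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def choose_checksum(cc: int, original_valid: bool, truncated: bool,
                    optional_unused: bool = False) -> Tuple[int, Optional[str]]:
    """Return (checksum to write, meta-data note or None).

    cc is the checksum recomputed over the anonymized packet.
    """
    if optional_unused:                      # assumption: sender did not use it
        return 0, None
    if truncated:                            # cannot verify: use Cc and flag it
        return cc, "checksum not verifiable: packet truncated"
    if not original_valid:                   # corrupted: mark with 1
        return 1, None
    return cc, None


# IPv4 header with its checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
cc = internet_checksum(header)
```

Writing a fixed marker for corrupted packets keeps the fact that the original checksum was bad visible to later analysis, while the meta-data note distinguishes truncation from corruption.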


Link layer

An Ethernet address is 6 bytes; the high 3 bytes identify the NIC vendor

Scrambling the entire 6-byte address is not good for research

Scrambling only the low 3 bytes leaves the vendor identifier exposed

Solution: remap the two parts separately
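A minimal sketch of remapping the two halves separately, assuming a secret key and using HMAC-SHA256 as the keyed mapping. This choice is an assumption for illustration and not necessarily what tcpmkpub does: truncated HMAC output is not a permutation, so distinct inputs can collide, and the sketch does not treat the multicast bit specially.

```python
import hashlib
import hmac


def _remap3(key: bytes, label: bytes, part: bytes) -> bytes:
    """Keyed, deterministic mapping of a 3-byte value to 3 bytes."""
    return hmac.new(key, label + part, hashlib.sha256).digest()[:3]


def anonymize_mac(mac: bytes, key: bytes) -> bytes:
    """Remap the vendor half and the device half of a MAC independently."""
    if len(mac) != 6:
        raise ValueError("an Ethernet address must be 6 bytes")
    vendor, device = mac[:3], mac[3:]
    # Separate labels keep the two mappings unrelated to each other.
    return _remap3(key, b"vendor", vendor) + _remap3(key, b"device", device)


key = b"example-secret-key"  # hypothetical key
a = anonymize_mac(bytes.fromhex("001122aabbcc"), key)
b = anonymize_mac(bytes.fromhex("001122ddeeff"), key)
```

Two cards from the same vendor still share an anonymized vendor prefix, so vendor-level analysis remains possible without revealing which vendor it is.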


Network layer (1) – focus on IP addresses

External addresses: use the prefix-preserving address anonymization scheme proposed in another paper

Internal addresses: do not use the prefix-preserving address anonymization scheme

Use a prefix that is not used by external addresses in the anonymized packets

Subnet and host portions are mapped separately
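The prefix-preserving property used for external addresses means that two addresses sharing a k-bit prefix still share exactly a k-bit prefix after anonymization. The slide does not name the scheme, so the sketch below is a generic construction: each output bit is the input bit XORed with a keyed pseudorandom function of the preceding bits. HMAC-SHA256 stands in for that function, and the key and function names are assumptions.

```python
import hashlib
import hmac
import ipaddress


def prefix_preserving(addr: str, key: bytes) -> str:
    """Anonymize an IPv4 address while preserving shared prefix lengths."""
    bits = format(int(ipaddress.IPv4Address(addr)), "032b")
    out = []
    for i in range(32):
        # The flip for bit i depends only on the i bits before it.
        digest = hmac.new(key, bits[:i].encode(), hashlib.sha256).digest()
        flip = digest[0] & 1
        out.append(str(int(bits[i]) ^ flip))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


key = b"example-secret-key"  # hypothetical key
x, y = "10.1.2.3", "10.1.2.77"
ax, ay = prefix_preserving(x, key), prefix_preserving(y, key)
```

The same property is what makes the scheme risky for internal addresses: knowing one mapping reveals the leading bits of its neighbours, which is one reason to handle internal addresses differently.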


Network layer (2) – scanners

Many organizations run a scanner as part of their security operations

Scanners tend to hit addresses in some order, such as a.b.c.1, a.b.c.2, a.b.c.3, etc.

Keep the scanner's IP address consistent across the trace, and flag it in the meta-data. For the destinations of the scans, use a different mapping. For example, suppose X1 and X2 belong to one subnet Y:

In traffic not involving the scanner, they map to X'1 and X'2 in subnet Y'

In traffic involving the scanner, they map to X''1 and X''2 in subnets Z1 and Z2
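The two mappings for scan destinations can be sketched as one function with a flag. The sketch assumes /24 internal subnets and uses HMAC-SHA256 as the keyed mapping; both are assumptions for illustration. Truncated HMAC output can collide, and placing the results in a prefix unused by external addresses is omitted here.

```python
import hashlib
import hmac
import ipaddress


def _prf(key: bytes, label: bytes, value: bytes, nbytes: int) -> bytes:
    """Keyed, deterministic mapping to nbytes bytes."""
    return hmac.new(key, label + b"|" + value, hashlib.sha256).digest()[:nbytes]


def anonymize_internal(addr: str, key: bytes, involves_scanner: bool) -> str:
    """Map an internal address; scan traffic gets an unrelated mapping."""
    packed = ipaddress.IPv4Address(addr).packed
    subnet, host = packed[:3], packed[3:]      # assumes /24 subnets
    if involves_scanner:
        # Whole-address mapping: neighbours land in unrelated subnets,
        # so the scan order no longer reveals the subnet layout.
        new = _prf(key, b"scan", packed, 4)
    else:
        # Subnet and host are mapped separately: neighbours stay together.
        new = _prf(key, b"subnet", subnet, 3) + _prf(key, b"host", host, 1)
    return str(ipaddress.IPv4Address(new))


key = b"example-secret-key"  # hypothetical key
n1 = anonymize_internal("172.16.5.1", key, involves_scanner=False)
n2 = anonymize_internal("172.16.5.2", key, involves_scanner=False)
s1 = anonymize_internal("172.16.5.1", key, involves_scanner=True)
```

Here `n1` and `n2` share a /24 (the X'1, X'2 in Y' case), while the scan-traffic mapping sends each destination to an independently chosen address.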


Network layer (3)

Multicast addresses: preserved

Private addresses: preserved

Invalid addresses: remapped as if the subnet existed, with this noted in the meta-data


Transport layer

Preserve both port numbers and sequence numbers

Rewrite timestamp options: transform the timestamps into separate increasing counters

Reason: clock drift manifested in timestamp options can be leveraged to fingerprint a physical machine
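A minimal sketch of the timestamp rewrite, assuming one counter per sending host (the slide says only "separate" counters) and ignoring the echoed timestamp field. The class name `TimestampRewriter` is hypothetical. Each distinct original value from a host is assigned the next counter value in order of first appearance, so the rewritten values carry no clock-rate information.

```python
from typing import Dict


class TimestampRewriter:
    """Replace TCP timestamp values with per-host increasing counters."""

    def __init__(self) -> None:
        # host -> {original timestamp value -> assigned counter}
        self._maps: Dict[str, Dict[int, int]] = {}

    def rewrite(self, host: str, tsval: int) -> int:
        mapping = self._maps.setdefault(host, {})
        if tsval not in mapping:
            # First time this value is seen from this host: next counter.
            mapping[tsval] = len(mapping) + 1
        return mapping[tsval]


rewriter = TimestampRewriter()
host_a = [rewriter.rewrite("A", t) for t in (1000, 1000, 1010, 1500)]
host_b = [rewriter.rewrite("B", t) for t in (77,)]
```

Repeated values map to the same counter, so packets that carried equal timestamps remain equal after rewriting, while the gaps between values (1000 to 1010 versus 1010 to 1500) are erased.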


Testing

Can the transformed traces really be used?

Use p0f to do OS fingerprinting

Use tcpsum to count the packets and bytes in both the original and transformed traces


Testing cont.

Are the transformed traces really anonymous?

Check tcpmkpub's own log file

Look for certain strings in the anonymized traces, e.g. "Document", "Setting", "ConfirmFileOp"

Look for IP addresses

Look for string versions of IP addresses

Look for MAC addresses

Check timestamps
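The leak checks above amount to searching the anonymized trace for values that should no longer be there. The sketch below scans raw trace bytes for known original IP addresses (binary and dotted-string form), MAC addresses, and telltale strings. The function name `find_leaks` and the sample data are hypothetical, and plain substring matching can produce false positives.

```python
import ipaddress
from typing import Iterable, List


def find_leaks(trace: bytes, ips: Iterable[str], macs: Iterable[str],
               strings: Iterable[str]) -> List[str]:
    """Report original identifiers that still appear in the trace bytes."""
    findings: List[str] = []
    for ip in ips:
        if ipaddress.IPv4Address(ip).packed in trace:   # 4-byte binary form
            findings.append(f"binary IP {ip}")
        if ip.encode("ascii") in trace:                 # dotted-string form
            findings.append(f"string IP {ip}")
    for mac in macs:
        if bytes.fromhex(mac.replace(":", "")) in trace:
            findings.append(f"MAC {mac}")
    for s in strings:
        if s.encode("ascii") in trace:
            findings.append(f"string {s!r}")
    return findings


# Hypothetical trace bytes containing a leaked string and a binary address.
trace = b"\x00\x00\x00\x00GET /Document HTTP" + bytes([192, 168, 1, 7])
leaks = find_leaks(trace, ips=["192.168.1.7"], macs=["00:11:22:aa:bb:cc"],
                   strings=["Document", "Setting"])
```

An empty result does not prove the trace is anonymous; it only shows that these particular values were not found in these particular encodings.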


Paper contributions

Develops a tool, tcpmkpub, for implementing arbitrary anonymization policies

Uses meta-data to help researchers deal with lost information, e.g. invalid checksums and scanner IP addresses

Goes beyond IP address obfuscation to explore many other dangerous details: timestamps, Ethernet addresses, etc.


Paper weaknesses

Gives only two experiments to show that the anonymized traces are useful

Could have given some anonymization results to make the policy clearer. For example, in the scanner case, what would addresses a.b.c.1, a.b.c.2 and a.b.c.3 look like if they are involved in scanning traffic, and what if they are not?


Future work

Preserve more consistency between the original and anonymized traces

Study online anonymization

Provide a tool that can easily be used to validate the anonymized traces

Provide a tool for creating an anonymization policy for tcpmkpub


Questions?