The Devil and Packet Trace Anonymization
Authors: Ruoming Pang, Mark Allman, Vern Paxson
and Jason Lee
Published: ACM SIGCOMM Computer Communication Review, Volume 36, Issue 1, January 2006
Presenter: Ping Wang
Overview
Problem: How to anonymize packet traces before they are released
Goal: Preserve as much information as possible
Background
Why share?
  Verify previous results
  Compare competing ideas on the same data
  Provide a broader view
Who shares?
  NLANR’s PMA packet traces
  CAIDA’s skitter measurement
  LBNL’s internal traffic
Background cont.
Available anonymization tools
  tcpdpriv
  ipsumdump
  tcpurify
Not general enough; most of them focus only on header fields, primarily IP addresses
A New tool - tcpmkpub
Provides a general framework for anonymizing traces
It is based on explicit rules for each header field
An example specification
All fields must be specified with a name, a length, and an action (“KEEP”, “ZERO”, or a function)
An example specification cont.
Supports case statements for header fields that can vary
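The example specification itself is not reproduced in this transcript. As a rough illustration of the idea of one explicit rule per header field, here is a minimal Python sketch. It is not tcpmkpub's own syntax: the names apply_rules and UDP_RULES, and the choice of a UDP-style header, are assumptions made for the example.

```python
KEEP = "KEEP"
ZERO = "ZERO"


def apply_rules(header: bytes, rules) -> bytes:
    """Apply (name, length, action) rules to a header, field by field.

    Hypothetical helper: action is KEEP, ZERO, or a function that takes
    the field bytes and returns replacement bytes of the same length.
    """
    out = bytearray()
    offset = 0
    for name, length, action in rules:
        field = header[offset:offset + length]
        if len(field) != length:
            raise ValueError(f"header too short for field {name!r}")
        if action == KEEP:
            out += field
        elif action == ZERO:
            out += bytes(length)
        elif callable(action):
            new_field = action(field)
            if len(new_field) != length:
                raise ValueError(f"action for {name!r} changed the field length")
            out += new_field
        else:
            raise ValueError(f"unknown action for field {name!r}")
        offset += length
    return bytes(out)


# Illustrative rule set for an 8-byte UDP-style header (an assumption).
UDP_RULES = [
    ("src_port", 2, KEEP),
    ("dst_port", 2, KEEP),
    ("length", 2, KEEP),
    ("checksum", 2, ZERO),
]

anonymized = apply_rules(bytes.fromhex("003500351234abcd"), UDP_RULES)
```

Every field has to be named and given an action, so a field that nobody thought about cannot slip through unmodified by accident.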
Anonymization Policy
Checksums
Link layer
Network layer
Transport layer
Checksums
Replace the original checksum C0 with the recomputed checksum Cc
For checksums that cannot be verified:
  The packet has been corrupted: insert “1”
  The original packet is truncated: use Cc (noted in the meta-data)
Where the checksum is optional, as in UDP, use zero as the checksum
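The checksum cases above can be sketched in Python as follows. The function choose_checksum, its flags, and the order in which the cases are tested are assumptions for illustration. The slide does not say whether the zero rule applies to every packet with an optional checksum or only to those that omitted it, so the sketch leaves that decision to the caller. internet_checksum is the standard 16-bit ones' complement Internet checksum.

```python
def internet_checksum(data: bytes) -> int:
    """Standard 16-bit ones' complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def choose_checksum(recomputed, *, corrupted=False, truncated=False,
                    optional=False):
    """Return (checksum_value, meta_data_note) for an anonymized packet.

    Hypothetical policy helper; the precedence of the flags is an assumption.
    """
    if optional:
        return 0, None
    if corrupted:
        return 1, "original checksum was invalid"
    if truncated:
        return recomputed, "packet truncated; checksum could not be verified"
    return recomputed, None


cc = internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))
value, note = choose_checksum(cc, truncated=True)
```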
Link layer
An Ethernet address is 6 bytes; the high 3 bytes identify the NIC vendor
Scrambling the entire 6-byte address is not good for research, because the vendor information is lost
Scrambling only the low 3 bytes is not good either, because the vendor stays exposed
Remap the two parts separately
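One way to picture remapping the two halves separately is the Python sketch below. It uses a keyed hash (HMAC-SHA256) as a stand-in for whatever mapping the tool actually uses; that choice, the function name remap_mac, and the key are assumptions. A truncated hash can map two inputs to the same output, so a real tool would presumably need a collision-free mapping.

```python
import hashlib
import hmac


def _prf(key: bytes, label: bytes, data: bytes, nbytes: int) -> bytes:
    """Keyed pseudo-random function truncated to nbytes (illustrative)."""
    return hmac.new(key, label + data, hashlib.sha256).digest()[:nbytes]


def remap_mac(mac: bytes, key: bytes) -> bytes:
    """Remap the vendor half and the device half of a MAC independently."""
    if len(mac) != 6:
        raise ValueError("an Ethernet address must be 6 bytes")
    vendor, device = mac[:3], mac[3:]
    return _prf(key, b"vendor", vendor, 3) + _prf(key, b"device", device, 3)


key = b"example-secret-key"  # hypothetical key
a = remap_mac(bytes.fromhex("001122aabbcc"), key)
b = remap_mac(bytes.fromhex("001122ddeeff"), key)
```

Two cards from the same vendor still share their (now meaningless) high 3 bytes, so per-vendor analysis remains possible without naming the vendor.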
Network layer (1) – focus on IP addresses
External addresses
  Use the prefix-preserving address anonymization scheme proposed in another paper
Internal addresses
  Do not use the prefix-preserving address anonymization scheme
  Use a prefix that is not used by external addresses in the anonymized trace
  Subnet and host portions are mapped separately
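The slide does not name the prefix-preserving scheme used for external addresses. To show what "prefix-preserving" means, here is a simplified Python sketch: each output bit is the input bit XORed with a keyed pseudo-random bit that depends only on the preceding input bits. HMAC-SHA256, the key, and the function names are assumptions for illustration, not the scheme from the paper.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Prefix-preserving IPv4 anonymization (simplified sketch)."""
    bits = int(ipaddress.IPv4Address(addr))
    out = 0
    for i in range(32):
        prefix = bits >> (32 - i)  # the i most significant input bits
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        original_bit = (bits >> (31 - i)) & 1
        out = (out << 1) | (original_bit ^ flip)
    return str(ipaddress.IPv4Address(out))


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses share."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


key = b"example-secret-key"  # hypothetical key
x = anonymize_ipv4("192.0.2.3", key)
y = anonymize_ipv4("192.0.2.77", key)
```

Two addresses that share their first k bits before anonymization share exactly their first k bits afterwards, which is why subnet structure survives for external addresses.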
Network layer (2) – Scanners
Many organizations run a scanner as part of their security operations
Scanners tend to hit addresses in some order, e.g. a.b.c.1, a.b.c.2, a.b.c.3
Keep the scanner’s IP address uniform across the trace, and flag it in the meta-data
For the destinations of the scans, use a different mapping
For example, X1 and X2 belong to one subnet Y:
  Traffic not involving a scanner: map to X’1, X’2 in subnet Y’
  Traffic involving a scanner: map to X’’1, X’’2 in subnets Z1 and Z2
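The two mappings for internal destinations can be sketched in Python as below. The class names, the /24 subnet size, the seeded random tables, and the use of one host table shared by all subnets are assumptions for illustration. The sketch also leaves out the step of placing internal addresses under a prefix unused by external addresses, and it does not cover how the scanner's own address is mapped.

```python
import ipaddress
import random


class Remapper:
    """Collision-free random remapping of integers in range(size)."""

    def __init__(self, size: int, seed: int):
        self._size = size
        self._rng = random.Random(seed)
        self._table = {}
        self._used = set()

    def __call__(self, value: int) -> int:
        if value not in self._table:
            candidate = self._rng.randrange(self._size)
            while candidate in self._used:  # retry until an unused value
                candidate = self._rng.randrange(self._size)
            self._used.add(candidate)
            self._table[value] = candidate
        return self._table[value]


class InternalAddressMapper:
    """Map internal destinations, with a separate mapping for scan traffic."""

    def __init__(self, scanners, seed: int = 0):
        self.scanners = set(scanners)
        self._subnet = Remapper(1 << 24, seed)           # assumes /24 subnets
        self._host = Remapper(1 << 8, seed + 1)
        self._scan_target = Remapper(1 << 32, seed + 2)  # whole address

    def map_destination(self, src: str, dst: str) -> str:
        value = int(ipaddress.IPv4Address(dst))
        if src in self.scanners:
            # Scan traffic: remap the whole address, so neighbours scatter.
            mapped = self._scan_target(value)
        else:
            # Normal traffic: remap subnet and host portions separately.
            mapped = (self._subnet(value >> 8) << 8) | self._host(value & 0xFF)
        return str(ipaddress.IPv4Address(mapped))


mapper = InternalAddressMapper(scanners={"10.0.0.99"})
x1 = mapper.map_destination("10.0.0.5", "10.1.2.1")
x2 = mapper.map_destination("10.0.0.5", "10.1.2.2")
s1 = mapper.map_destination("10.0.0.99", "10.1.2.1")
```

Because scan destinations are remapped as whole addresses, the sequential order a.b.c.1, a.b.c.2, a.b.c.3 that a scanner follows no longer reveals which anonymized addresses are neighbours.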
Network layer (3)
Multicast addresses preserved
Private addresses preserved
Invalid addresses: remapped as if the subnet existed, with a note in the meta-data
Transport layer
Preserve both port numbers and sequence numbers
Rewrite timestamp options
  Transform the timestamps into separate increasing counters
  Reason: clock drift manifested in timestamp options can be leveraged to fingerprint a physical machine
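A minimal Python sketch of the timestamp rewrite follows. It assumes one counter per sending host (the slide says only "separate"), assigns counter values in order of first appearance, and ignores timestamp wraparound and reordering; the class name is hypothetical.

```python
class TimestampRewriter:
    """Replace TCP timestamp values with per-host counters (sketch)."""

    def __init__(self):
        self._seen = {}  # host -> {original timestamp: counter value}

    def rewrite(self, host: str, tsval: int) -> int:
        table = self._seen.setdefault(host, {})
        if tsval not in table:
            # A new timestamp value gets the next counter for this host.
            table[tsval] = len(table) + 1
        return table[tsval]


rewriter = TimestampRewriter()
values = [
    rewriter.rewrite("hostA", 1000),
    rewriter.rewrite("hostA", 1000),
    rewriter.rewrite("hostA", 1250),
    rewriter.rewrite("hostB", 77),
    rewriter.rewrite("hostA", 1000),
]
```

The counters keep the information that one timestamp differs from or repeats another, but the size of the gap between ticks, which is what reveals clock drift, is gone.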
Testing
Can the transformed traces really be used?
  Use p0f to do OS fingerprinting
  Use tcpsum to find the number of packets and bytes in both the original and transformed traces
Testing cont.
Are the transformed traces really anonymous?
  Check tcpmkpub’s own log file
  Look for certain strings in the anonymized traces, e.g. “Document”, “Setting”, “ConfirmFileOp”
  Look for IP addresses
  Look for string versions of IP addresses
  Look for MAC addresses
  Check timestamps
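The string and address checks can be approximated with a simple byte search over the anonymized trace. The Python sketch below is an assumption about how such a check might look, not the authors' procedure; find_leaks and the sample data are hypothetical. Short binary patterns such as a 4-byte address can also match by chance, so every hit needs a manual look.

```python
import ipaddress


def find_leaks(data: bytes, strings, original_ips):
    """Search anonymized trace bytes for strings and original addresses."""
    hits = []
    for text in strings:
        if text.encode("ascii") in data:
            hits.append(("string", text))
    for ip in original_ips:
        if ipaddress.IPv4Address(ip).packed in data:
            hits.append(("binary-ip", ip))  # raw 4-byte form
        if ip.encode("ascii") in data:
            hits.append(("text-ip", ip))    # dotted-decimal string form
    return hits


# Made-up bytes standing in for an anonymized trace.
sample = b"\x00\x01" + bytes([192, 0, 2, 7]) + b"GET /Document 198.51.100.9"
leaks = find_leaks(sample, ["Document", "Setting"],
                   ["192.0.2.7", "198.51.100.9"])
```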
Paper contributions
Develops a tool, tcpmkpub, for implementing arbitrary anonymization policies
Uses meta-data to help researchers deal with lost information, e.g. invalid checksums and scanner IP addresses
Goes beyond IP address obfuscation to explore many other dangerous details: timestamps, Ethernet addresses, etc.
Paper weaknesses
Gives only two experiments to show that the anonymized traces are useful
Could have given some anonymization results to make the policy clearer. For example, in the scanner case, what the addresses a.b.c.1, a.b.c.2, a.b.c.3 would look like when they are involved in scanning traffic, and when they are not
Future work
Keep more consistency between the original and anonymized traces
Study online anonymization
Provide a tool that can easily be used for validating the anonymized traces
Provide a tool for creating an anonymization policy for tcpmkpub
Questions?