Signatures As Threats to Privacy Brian Neil Levine Assistant Professor Dept. of Computer Science UMass Amherst

Page 1

Signatures As Threats to Privacy

Brian Neil Levine
Assistant Professor

Dept. of Computer Science
UMass Amherst

Page 2

A Privacy Framework

• Your identity is composed of private details.
  – Some secured: password
  – Some protected: database inference (ppdm), RFID??
  – Some mundane: name, phone, purchases, movements, contacts

• Your actions leave signatures
  – Distinguishing, repeated statistical features, not necessarily unique.

• A collection of details may
  – allow access to a valued resource (identity theft):
    • name, address, account number is access to a credit card
  – Or allow identity profiling. Some details seem innocuous, but may be useful to others when linked together:
    • Email address and recent book purchase is good for spammers
    • Name and recent web sites visited is good for big brother.

Page 3

Signatures

• One type of signature is a user signature
  – Characteristics that result from your behavior and are persistent over time.
  – The web sites you visit.
  – The content of the sites you visit.
  – The path or roads you take to work each day.

Page 4

Visiting the Same Web site over time

Page 5

A simple example: User Interest Signatures

• We took a 9-month collection of web browser traffic of 16 volunteers at UMass

• We represented each user as a statistical distribution of words, ignoring phrases, order, and semantics (a language model).

• We looked for words that differentiate users from the community model (using Kullback-Leibler divergence).

• We split the trace in half and tested whether web pages retrieved in the second half could be matched to users from the first.
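
The steps on this slide can be illustrated with a minimal Python sketch. It is not the authors' code: the toy data, the additive smoothing, the function names, and the choice to match a page by log-likelihood are all assumptions made for illustration. Only the bag-of-words user model and the Kullback-Leibler comparison against a community model come from the slide.

```python
import math
from collections import Counter

def word_counts(pages):
    """Count words across a list of page texts (bag of words, order ignored)."""
    counts = Counter()
    for text in pages:
        counts.update(text.lower().split())
    return counts

def to_distribution(counts, vocab, alpha=1.0):
    """Additive smoothing (an assumption) so every vocabulary word has nonzero probability."""
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def distinguishing_words(user_dist, community_dist, k=5):
    """Words with the largest per-word contribution to KL(user || community)."""
    terms = {w: p * math.log(p / community_dist[w]) for w, p in user_dist.items()}
    return sorted(terms, key=terms.get, reverse=True)[:k]

def match_page(page_text, user_dists):
    """Pick the user whose model gives the page the highest log-likelihood.

    Words outside the shared vocabulary are skipped for every user alike.
    """
    words = page_text.lower().split()

    def loglik(dist):
        return sum(math.log(dist[w]) for w in words if w in dist)

    return max(user_dists, key=lambda u: loglik(user_dists[u]))

# Hypothetical toy data standing in for the first half of a trace.
first_half = {
    "alice": ["red sox baseball scores", "baseball trade news"],
    "bob": ["python compiler bugs", "compiler news"],
}
counts = {user: word_counts(pages) for user, pages in first_half.items()}
community = sum(counts.values(), Counter())   # pooled counts of all users
vocab = set(community)
user_dists = {u: to_distribution(c, vocab) for u, c in counts.items()}
community_dist = to_distribution(community, vocab)

# A page from the (hypothetical) second half of the trace.
print(match_page("baseball scores tonight", user_dists))
print(distinguishing_words(user_dists["alice"], community_dist, k=3))
```

In this sketch a user's signature is simply the set of words whose probability is inflated relative to the community, which is why heavy readers of generic news would be expected to be harder to tell apart.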

Page 6

A simple experiment

• 625 to 12,548 retrievals per user in second half (avg 3,400)

• Graph shows the accuracy of the top 1000 or 100 pages for each user

• Some users are predictable, some are not (likely depending on how much news they read).

• Some difficulties but a promising approach…

Page 7

Network Traffic Signatures

• Signatures of User Interest can be protected by an encrypted connection
  – protects what words you are reading.

• But, can I still guess the web site you are visiting without knowing the content?
  – SSL doesn’t multiplex requests: the size of each object is easily known! [Danezis].
  – You can give each web site a signature based on object sizes. [Sun et al]

• What if we multiplex the streams?
  – Any VPNs and WEP-like protection will do this.
  – Pipelined HTTP 1.1 has a similar effect.
  – Can network timing characteristics leak a signature?
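
The object-size signature idea can be sketched in a few lines of Python. This is an illustration under stated assumptions: the sizes are made up, the function names are hypothetical, and comparing signatures with a multiset Jaccard coefficient is one plausible choice rather than a claim about what the cited work does.

```python
from collections import Counter

def size_signature(object_sizes):
    """Represent a site as a multiset of the object sizes seen on the wire."""
    return Counter(object_sizes)

def jaccard(sig_a, sig_b):
    """Multiset Jaccard coefficient: shared objects over all objects."""
    union = sum((sig_a | sig_b).values())
    if union == 0:
        return 0.0
    return sum((sig_a & sig_b).values()) / union

def best_match(observed_sizes, known_sites):
    """Return the known site whose signature is most similar to the observation."""
    observed = size_signature(observed_sizes)
    return max(known_sites, key=lambda site: jaccard(observed, known_sites[site]))

# Hypothetical per-site object sizes in bytes.
known_sites = {
    "site-a": size_signature([1200, 340, 340, 5000]),
    "site-b": size_signature([80, 80, 9100, 410]),
}
print(best_match([1200, 340, 5000, 77], known_sites))
```

Because only sizes are used, the sketch needs no access to the page content, which is the point of the slide. Once streams are multiplexed the individual object sizes are no longer separable, and this kind of signature stops working.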

Page 8

[Figure: two panels, Yahoo.com and Google.com, each plotting cumulative bytes received (y-axis) against time (x-axis).]

Page 9

Experiment

• Five months of a Mozilla browser visiting 100 sites (most popular from previous study) once each every 30 minutes.

  – We recorded the encrypted version of each request.
  – Data was broken up into two halves: training and testing.

• Two methods of characterization
  – The ordered packet sizes: We don’t care when they arrived
  – The ordered packet interarrival times: We don’t care about their size.

• Comparison by cross correlation.
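
As a rough illustration of comparing traces by cross correlation, here is a minimal Python sketch. The slide does not say which form of cross correlation was used, so the zero-lag normalized version below is an assumption, as are the function names and the toy packet sizes. The same comparison applies unchanged to interarrival-time sequences.

```python
import math

def interarrival_times(timestamps):
    """Gaps between consecutive packet arrival times."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def normalized_xcorr(a, b):
    """Zero-lag normalized cross correlation of two sequences.

    Sequences are truncated to their common length. Returns a value in
    [-1, 1], or 0.0 when either sequence is constant or too short.
    """
    n = min(len(a), len(b))
    if n < 2:
        return 0.0
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

def classify(trace, profiles):
    """Return the training profile that correlates best with the test trace."""
    return max(profiles, key=lambda site: normalized_xcorr(trace, profiles[site]))

# Hypothetical ordered packet sizes (bytes) from the training half.
profiles = {
    "site-a": [1500, 1500, 400, 1500, 60],
    "site-b": [60, 300, 60, 1500, 1500],
}
# A trace from the testing half, compared against each profile.
print(classify([1500, 1480, 420, 1500, 80], profiles))
```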

Page 10

Page 11

Defenses

• Packet size:
  – Easiest to measure from any point in the Internet
  – But, this is easiest to fix at the access point, base station, or VPN endpoint:
    • You can pad packets easily to thwart attackers.
    • Do we need to pad acks?
    • Does this kill performance?

• Interarrival times:
  – This is harder to fix, as there are many sources of delay.
  – We are experimenting with adding random noise at an 802.11 AP.
  – Performance question more relevant here
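
Both defenses can be sketched briefly in Python to make the performance question concrete. Everything specific here is an assumption: the 512-byte bucket, the uniform noise distribution, the 20 ms bound, and the function names are illustrative and do not come from the slides.

```python
import random

def pad_to_bucket(size, bucket=512):
    """Round a packet size up to the next multiple of `bucket` bytes."""
    return -(-size // bucket) * bucket

def padding_overhead(sizes, bucket=512):
    """Fraction of extra bytes sent because of padding."""
    original = sum(sizes)
    padded = sum(pad_to_bucket(s, bucket) for s in sizes)
    return (padded - original) / original

def add_jitter(arrivals, max_delay=0.02, seed=None):
    """Delay each packet by uniform random noise, preserving packet order.

    `arrivals` are arrival times in seconds at the (hypothetical) access
    point; the result is the times at which packets would be released.
    """
    rng = random.Random(seed)
    released = []
    last = float("-inf")
    for t in arrivals:
        out = max(t + rng.uniform(0.0, max_delay), last)
        released.append(out)
        last = out
    return released

sizes = [60, 1500, 420, 1500]
print([pad_to_bucket(s) for s in sizes])   # all sizes become multiples of 512
print(padding_overhead(sizes))             # bandwidth cost of hiding sizes
print(add_jitter([0.0, 0.001, 0.5], seed=1))
```

The overhead figure and the added delay are the two costs the slide asks about: coarser buckets and larger noise hide more but cost more bandwidth and latency.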

Page 12

Extrapolating…

• Can we flip the attack on its head?
  – Capture traffic going to and from a user; does the traffic identify her later given a repeat in visited sites?

• Can we combine this technique with other signatures for a more robust attack?
  – There are other dependencies that can be brought into the probabilities
  – Each user has a collection of web sites they historically visit
  – Given that one web site has been identified, does that influence our next guess?
  – Given our location (a café), time of day, etc…

• Personal mobile privacy firewalls as a solution?
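
One way to read "bringing other dependencies into the probabilities" is as a Bayesian update, sketched below in Python. The slides only pose this as an open question, so the formulation, the function name, and all numbers are assumptions: a prior over sites from a user's browsing history is combined with a likelihood derived from the traffic signature.

```python
def posterior(prior, likelihood):
    """Combine a prior over sites with per-site trace likelihoods (Bayes' rule).

    prior:      P(site), e.g. from the sites a user historically visits
    likelihood: P(observed trace | site), e.g. from a traffic-signature score
    """
    unnormalized = {s: prior.get(s, 0.0) * likelihood.get(s, 0.0) for s in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no site is consistent with both prior and likelihood")
    return {s: p / total for s, p in unnormalized.items()}

# Hypothetical numbers: the trace alone favors site-b, but the user's
# history makes site-a the more probable guess overall.
history_prior = {"site-a": 0.8, "site-b": 0.2}
trace_likelihood = {"site-a": 0.3, "site-b": 0.6}
print(posterior(history_prior, trace_likelihood))
```

Location or time of day could enter the same way, as further factors multiplied into the prior, provided one is willing to assume they are independent given the site.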

Page 13

Lalana’s Questions…

• What are technologies that lead to loss of privacy?

• Is the loss of privacy worth the advantages?

• Are there technologies that provide some level of privacy control to users?

• Should we be concerned about the "second hand" privacy loss that can occur when my wearable computing system records my interactions with you with or without your consent?

• What is the best approach for pervasive computing systems: more personalization or greater privacy?