CrowdSurf - SIGCOMMconferences.sigcomm.org/sigcomm/2016/files/program/... · Example Web tracking...

CrowdSurf Empowering Transparency in the Web Hassan Metwalley Stefano Traverso Marco Mellia Stanislav Miskovic Mario Baldi 25 Aug 2016, ACM SIGCOMM, Florianopolis


CrowdSurf
Empowering Transparency in the Web

Hassan Metwalley, Stefano Traverso, Marco Mellia, Stanislav Miskovic, Mario Baldi

25 Aug 2016, ACM SIGCOMM, Florianopolis


26 August 2016 CrowdSurf - Stefano Traverso 2

Introduction


Do you know what you HTTP?


Example Web tracking

Thousands of Web trackers collect our data
- Browsing histories
- Religious, sexual, and political preferences

- On average, the first tracker is met as soon as the browser starts [1]
- Some trackers reach 96% of users [1]
- 71% of websites host at least one tracker [1]

[1] Metwalley, H. et al. “The Online Tracking Horde: A View from Passive Measurements”, TMA 2015


The Open Question

How to know and choose which services our data is exchanged with and how?


Partial solutions

In-network devices
- Firewalls and proxies
  - Fail in case of encrypted traffic (HTTPS)
  - Lack scalability
  - Managed by third parties

On-client
- Browser plugins
  - Limited scope
  - No control on device traffic
  - Not transparent


A New System

Goal: let users re-gain visibility and control on the information they exchange with Web services

Design Principles
- Holistic: working in any scenario
- Client-centric: available on any kind of device
- Practical, not revolutionary: use existing technology
- Crowd-sourced: knowledge built on a community of users
- Automatic: little engagement of the user
- Privacy-safe: never compromise users’ privacy


CrowdSurf


CrowdSurf

Cloud
- A controller collects information about the services users visit
  - Explicit -> their opinion
  - Implicit -> traffic samples
- Users’ contributions processed by data-analyzers and the advising community
- Results = suggestions about the reputation of services

Client
- Users download the suggestions they like
- The CrowdSurf Layer translates them into rules
- Rules = actions on users’ traffic
  - Regexp + action
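The client-side step above (suggestions translated into "regexp + action" rules) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Suggestion` and `Rule` types, the verdict labels, and the verdict-to-action mapping are hypothetical and are not taken from the CrowdSurf prototype.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    domain: str   # service the suggestion is about
    verdict: str  # hypothetical reputation label, e.g. "tracker" or "trusted"


@dataclass(frozen=True)
class Rule:
    pattern: re.Pattern  # regular expression matched against the request URL
    action: str          # "allow", "block" or "log_and_report"


# Hypothetical mapping from a reputation verdict to an action on the traffic.
VERDICT_TO_ACTION = {
    "tracker": "block",
    "suspicious": "log_and_report",
    "trusted": "allow",
}


def to_rule(suggestion: Suggestion) -> Rule:
    """Translate one downloaded suggestion into a regexp + action rule."""
    # Match the domain itself and any of its subdomains, over HTTP or HTTPS.
    regex = r"^https?://([^/]+\.)?" + re.escape(suggestion.domain) + r"(:\d+)?(/|$)"
    return Rule(re.compile(regex), VERDICT_TO_ACTION[suggestion.verdict])


def decide(url: str, rules: list[Rule], default: str = "allow") -> str:
    """Return the action of the first rule whose regexp matches the URL."""
    for rule in rules:
        if rule.pattern.search(url):
            return rule.action
    return default


rules = [to_rule(Suggestion("acmetrack.com", "tracker"))]
print(decide("http://acmetrack.com/query?key1=X&key2=Y", rules))  # block
print(decide("http://example.org/", rules))                       # allow
```

First-match-wins with a default of "allow" is a design choice of this sketch; a real layer would need an explicit policy for conflicting suggestions.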


CrowdSurf Controllers

Open Controller
- Collaborative approach
- Users improve the wisdom of the system
  - Traffic samples and opinions
  - Build data analyzers and suggestions

Corporate Controller
- Builds rules directly for employees
- Employees cannot customize rules
- All devices follow the same rules


The CrowdSurf Layer

[Figure: the CrowdSurf Layer in the protocol stack (HTTP, TLS, TCP). Labels: Open Controller, Corporate Controller, Suggestions to Rules, Rule Processor, Regular Expression Matching, Action (Redirect, Modify, Allow, Block, Log and Report), Anonymization.]


CrowdSurf in a picture

[Figure: Web Services, Open Controller, and Corporate Controller. Flow labels: Opinions + Traffic samples, Suggestions, Traffic samples, Rules, Ruled Interaction.]


Proof of Concept


Prototype

Controller
- Java-based web service
- Communicates with CrowdSurf devices
- Hosts a data analyzer for identification of tracking sites
- Collects traffic samples
- Distributes suggestions

Client
- Implemented as a Firefox plugin
- Supports block, redirect, log&report


Example of Data Analyzer: Automatic Tracker Detector

Unsupervised methodology to identify third-party trackers [2]
- Observation: trackers usually embed UIDs as URL parameters
- Procedure:
  1. Input: HTTP traffic samples provided by CS users
  2. Take all HTTP queries to third-party services, e.g. http://acmetrack.com/query?key1=X&key2=Y
  3. Extract keys (key1, key2) and their values
  4. Check the presence of key values uniquely associated to the users

[2] Metwalley, H. et al “Unsupervised Detection of Web Trackers”, IEEE Globecom 2015
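The four-step procedure above can be illustrated with a small sketch. It is a simplification written for this transcript, not the algorithm published in [2]: the function name `find_uid_keys`, the thresholds `min_users` and `min_obs`, and the sample data are all assumptions. A key is flagged when every user always sends the same value for it and no two users share a value.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def find_uid_keys(samples, min_users=2, min_obs=2):
    """samples: iterable of (user, url) pairs for third-party HTTP requests.

    Returns the set of (host, key) pairs whose values look like user IDs.
    """
    values_per_user = defaultdict(lambda: defaultdict(list))  # (host, key) -> user -> values
    users_per_value = defaultdict(lambda: defaultdict(set))   # (host, key) -> value -> users

    for user, url in samples:
        parts = urlsplit(url)
        # Step 3: extract keys and their values from the query string.
        for key, value in parse_qsl(parts.query):
            host_key = (parts.hostname, key)
            values_per_user[host_key][user].append(value)
            users_per_value[host_key][value].add(user)

    candidates = set()
    for host_key, per_user in values_per_user.items():
        # Step 4: the value is stable for each user across requests...
        stable = all(len(v) >= min_obs and len(set(v)) == 1 for v in per_user.values())
        # ...and each value is associated with exactly one user.
        exclusive = all(len(u) == 1 for u in users_per_value[host_key].values())
        if len(per_user) >= min_users and stable and exclusive:
            candidates.add(host_key)
    return candidates


# Toy input: three users, three visits each. sid changes on every request,
# tmp changes on every visit, uid stays fixed per user.
sids = iter("abcdefghi")
samples = [
    (user, f"http://acmetrack.com/query?sid={next(sids)}&tmp={tmp}&uid={uid}")
    for tmp in "mnp"
    for user, uid in zip(["u1", "u2", "u3"], "xyz")
]
print(find_uid_keys(samples))  # {('acmetrack.com', 'uid')}
```

Only `uid` survives the check: `sid` and `tmp` take several values for the same user, so they fail the stability condition.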


Example of Data Analyzer: Automatic Tracker Detector

http://acmetrack.com/query?sid=X&tmp=Y&uid=Z

[Figure: values of each key over time, across three visits]

Key   Visit 1   Visit 2   Visit 3
sid   a b c     d e f     g h i
tmp   m m m     n n n     p p p
uid   x y z     x y z     x y z

34 new third-party trackers found


Performance Implications of running CrowdSurf

Different user profiles

Paranoid Profile
- Blocks
  - adv/tracking
  - JS code
- Does not report traffic samples

Kid Profile
- Activates child protection rules
- Reports traffic to trackers

Corporate Profile
- Redirects search.google.com to search.bing.com
- Blocks social networks, e-commerce sites, trackers
- Reports activity on DropBox
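As an illustration, the Corporate Profile could be written down as an ordered list of (regexp, action, argument) entries. This encoding is an assumption made for this sketch, not the prototype's rule format; `social.example` and `shop.example` are placeholder domains, and `dropbox.com` is assumed as the DropBox host.

```python
import re

# Hypothetical encoding of the Corporate Profile. Each entry is
# (regexp on the request URL, action, optional argument).
CORPORATE_PROFILE = [
    (re.compile(r"^(https?://)search\.google\.com(?=[:/?]|$)"),
     "redirect", r"\1search.bing.com"),
    (re.compile(r"^https?://([^/]+\.)?(social\.example|shop\.example)(?=[:/?]|$)"),
     "block", None),
    (re.compile(r"^https?://([^/]+\.)?dropbox\.com(?=[:/?]|$)"),
     "log_and_report", None),
]


def apply_profile(url, profile, report):
    """Return the URL to fetch, or None if the request is blocked."""
    for pattern, action, arg in profile:
        if not pattern.search(url):
            continue
        if action == "block":
            return None
        if action == "redirect":
            # Rewrite only the matched scheme and host; keep path and query.
            return pattern.sub(arg, url, count=1)
        if action == "log_and_report":
            report.append(url)  # the request itself is still allowed
            return url
    return url  # no rule matched: allow unchanged


report = []
print(apply_profile("http://search.google.com/?q=sigcomm", CORPORATE_PROFILE, report))
# http://search.bing.com/?q=sigcomm
```

Because a Corporate Controller pushes the same rules to all devices, a static list like this would be enough on the client; the Paranoid and Kid profiles would differ only in their entries.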


Impact on Web site loading time

[Figure: Web site loading time for the Kid, Paranoid, and Corporate profiles]

- Paranoid is 1.07 times faster than baseline
- Kid is 1.08 times slower
- Corporate is 1.18 times slower


Conclusion


Open Problems

- Lots of details to consider
- Design/develop/standardize a new network layer
- Protecting users’ privacy
- Anonymizing HTTP/S traffic
- Usability
- Involve users to join
- Protection from malicious biases


Holistic, crowd-sourced system for the auditing of the information we expose in the Web

CrowdSurf

https://www.myermes.com


Thank you!


Need a new model that…

- Enables transparency and visibility: monitor the HTTP traffic before encryption takes place
- Takes actions: block/manipulate/report transactions to undesired services
- Under user’s control: automatic, but configurable


Example of Data Analyzer: Automatic Tracker Detector

Dataset: HTTP trace from ISP running Tstat
- 10 days of October 2014
- ~19k monitored users
- ~240k HTTP transactions per day

Website         Embedded Third-party Trackers
Portal1         26
News1           13
E-commerce1     12
E-commerce2     9
E-commerce3     4
Portal2         4
Porn            3
Sportnews       1
SearchEngine    1

News1
Third-party Trackers    Keys
cl.adform.net           xid
atemda.com              bidderuid
x.bidswitch.net         user_id
www.77tracking.com      rand
rack.movad.net          us
ovo01.webtrekk.net      cs2
dis.criteo.com          uid
p.rfihub.com            bk-uuid
ib.adnxs.com            xid

34 new third-party trackers found


Example A growing business around our data


[3] Metwalley, H. et al. “The Online Tracking Horde: A View from Passive Measurements”, TMA 2015


Loss of visibility and control

- HTTPS protects our privacy, but…
- …prevents third parties from checking what’s going on under the hood of encryption
- …and severely limits network functions

“Child protection through the use of Internet Watch Foundation blacklists has become ineffective, with just 5% of entries still being blocked when HTTPS is deployed” [2]

[2] Naylor, D. et al. “The Cost of the "S" in HTTPS”, CoNEXT 2014


Time to collect a dataset

[Figure label: googleanalytics]


Monitoring the Web

[Figure labels: HTTP [1], HTTPS/HTTP 2.0]

[1] Popa, L. et al., “HTTP As the Narrow Waist of the Future Internet,” ACM HotNets, 2010


CrowdSurf Controllers

Open Controller
- Collaborative approach
- Users improve the wisdom of the system
  - Traffic samples and opinions
  - Build data analyzers and suggestions

Third party Controller
- Suggestions for commercial purposes
- Opens to a market of suggestions

Corporate Controller
- Builds rules directly for employees
- Employees cannot customize rules
- All devices follow the same rules


CrowdSurf in a picture

[Figure: Web Services; Open controller, Corporate controller, Third-party controller; Data Analyzer; Corporate Device, Private User Device. Flow labels: Web Browsing, Traffic samples, Suggestions, Corporate Rules.]