Tracing Information Flows Between Ad Exchanges Using Retargeted Ads
Muhammad Ahmad Bashir, Sajjad Arshad, William Robertson, Christo Wilson
Northeastern University
Your Privacy Footprint
Real Time Bidding
• RTB brings more flexibility in the ad ecosystem.
• Ad request managed by an Ad Exchange which holds an auction.
• Advertisers bid on each ad impression.
• RTB spending to cross $20B by 2017 [1].
• 49% annual growth.
• Will account for 80% of US Display Ad spending by 2022.
• Cookie matching between the exchange and the advertiser is a prerequisite.
[1] http://www.prnewswire.com/news-releases/new-idc-study-shows-real-time-bidding-rtb-display-ad-spend-to-grow-worldwide-to-208-billion-by-2017-228061051.html
Real Time Bidding (RTB)
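Participants: User, Publisher, Ad Exchange, Advertisers
1. User → Publisher: GET, CNN’s Cookie
2. User → Ad Exchange: GET, DoubleClick’s Cookie
3. Ad Exchange → Advertisers: Solicit bids, DoubleClick’s Cookie
4. Advertisers → Ad Exchange: Bid
5. User → Advertiser: GET, RightMedia’s Cookie
6. Advertiser → User: Advertisement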
Advertisers cannot read their cookie!
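The flow on this slide can be sketched as a toy auction. This is only an illustration: the slides do not describe the auction mechanics, so the highest-bid-wins rule and all names here (`BidRequest`, `run_auction`, `make_bidder`) are assumptions. The point it shows is that a bidder sees only the exchange's ID for the user, never its own cookie.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class BidRequest:
    """What the exchange forwards to bidders (hypothetical structure)."""
    exchange_user_id: str  # the exchange's own cookie ID for the user
    publisher: str


# A bidder maps a bid request to a price; 0.0 means "no bid".
Bidder = Callable[[BidRequest], float]


def run_auction(request: BidRequest,
                bidders: Dict[str, Bidder]) -> Optional[Tuple[str, float]]:
    """Solicit one bid per advertiser; return (winner, price) or None."""
    bids = {name: bid(request) for name, bid in bidders.items()}
    bids = {name: price for name, price in bids.items() if price > 0}
    if not bids:
        return None
    winner = max(bids, key=bids.get)  # assumed rule: highest bid wins
    return winner, bids[winner]


def make_bidder(known_users: Dict[str, float], default: float = 0.10) -> Bidder:
    """Bid more for users found in `known_users`, looked up by the only
    identifier the bidder receives: the exchange's user ID."""
    def bid(request: BidRequest) -> float:
        return known_users.get(request.exchange_user_id, default)
    return bid
```

A bidder whose table is keyed by its own cookie (say `{"ABCDE": 2.00}`) cannot recognise the user in a request carrying `exchange_user_id="12345"` and falls back to the default bid; a bidder keyed by the exchange's ID can bid the higher price.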
Cookie Matching
Key problem: Advertisers cannot read their cookies in the RTB auction
• How can they submit reasonable bids if they cannot identify the user?
Solution: cookie matching
• Also known as cookie synching
• Process of linking the identifiers used by two ad exchanges
Example exchange of messages:
1. User → DoubleClick: GET, Cookie=12345
2. DoubleClick → User: 301 Redirect, Location=http://criteo.com/?dblclk_id=12345
3. User → Criteo: GET ?dblclk_id=12345, Cookie=ABCDE
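The redirect-based match on this slide can be sketched in a few lines. This is a minimal illustration under assumptions: the function names and the in-memory `match_table` are hypothetical, and only the `dblclk_id` parameter and the example values come from the slide.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_match_redirect(partner_url: str, exchange_cookie: str) -> str:
    """Exchange side: build the Location value that carries the
    exchange's ID for the user to the partner."""
    return f"{partner_url}?{urlencode({'dblclk_id': exchange_cookie})}"


def record_match(request_url: str, partner_cookie: str, match_table: dict) -> None:
    """Partner side: link the partner's own cookie to the exchange's ID
    found in the query string of the redirected request."""
    params = parse_qs(urlparse(request_url).query)
    exchange_ids = params.get("dblclk_id")
    if exchange_ids:
        match_table[exchange_ids[0]] = partner_cookie
```

For example, `build_match_redirect("http://criteo.com/", "12345")` yields the URL shown on the slide, and calling `record_match` on that URL with the partner cookie `"ABCDE"` stores the link between `12345` and `ABCDE`. Because the identifier travels in a URL, this kind of match is visible in HTTP traffic.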
Prior Work
• Several studies have examined cookie matching
• Acar et al. found hundreds of domains passing identifiers to each other
• Olejnik et al. found 125 exchanges matching cookies
• Falahrastegar et al. analyzed clusters of exchanges that share the exact same cookies
• These studies rely on studying HTTP requests/responses.
Challenge 1: Server Side Matching
1) Criteo observes the user. (IP: 207.91.160.7)
2) RightMedia observes the user. (IP: 207.91.160.7)
Behind the scenes, RightMedia and Criteo sync up. (IP: 207.91.160.7)
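A server-side sync of this kind could look like the sketch below. It is an assumption for illustration only: the slide suggests the client IP as the shared attribute, but the actual mechanism used by any exchange is not stated, and `match_by_ip` and the log format are hypothetical.

```python
from collections import defaultdict


def match_by_ip(log_a, log_b):
    """Pair up user IDs from two trackers' logs that share a client IP.

    Each log is an iterable of (ip, user_id) tuples. Returns a set of
    (user_id_a, user_id_b) pairs. The browser takes no part in this
    exchange, so nothing shows up in client-side HTTP traffic.
    """
    ids_by_ip = defaultdict(set)
    for ip, user_id in log_a:
        ids_by_ip[ip].add(user_id)
    return {
        (a_id, b_id)
        for ip, b_id in log_b
        for a_id in ids_by_ip.get(ip, ())
    }
```

This is why studies that only inspect HTTP requests and responses cannot see such matches.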
Challenge 2: Obfuscation
[Diagram labels: amazon.com, dbclk.js]
GET %^$ck#&93#&, Cookie=XYZYX
Goal
Develop a method to identify information flows (cookie matching) between ad exchanges
• Mechanism agnostic: resilient to obfuscation
• Platform agnostic: detect sharing on the client- and server-side
Key Insight: Use Retargeted Ads
Retargeted ads are the most highly targeted form of online ads
Key insight: because retargets are so specific, they can be used to conduct controlled experiments
• Information must be shared between ad exchanges to serve retargeted ads
Contributions
1. Novel methodology for identifying information flows between ad exchanges
2. Demonstrate the impact of ad network obfuscation in practice
• 31% of cookie matching partners cannot be identified using heuristics
3. Develop a method to categorize information sharing relationships
4. Use graph analysis to infer the roles of actors in the ad ecosystem
Outline: Data Collection, Classifying Ad Network Flows, Results
Using Retargets as an Experimental Tool
Key observation: retargets are only served under very specific circumstances
1) Advertiser observes the user at a shop
2) Advertiser and the exchange must have matched cookies
This implies a causal flow of information from the exchange to the advertiser
Data Collection Overview
Pipeline (90 personas):
1. Visit persona: single persona, 10 websites/persona, 10 products/website
2. Visit publishers: 150 publishers, 15 pages/publisher
3. Store images, inclusion chains, HTTP requests/responses: 571,636 images
4. Ad detection
5. Filter images which appeared in > 1 persona: 31,850 potential targeted ads
6. Crowd sourcing: isolated retargeted ads
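The persona filter in this pipeline can be sketched as follows. This is an interpretation, not the authors' code: it assumes the filter discards any image seen by more than one persona (a retarget should be specific to the persona that viewed the product), and it assumes images are identified by some hash. The function name and input format are hypothetical.

```python
from collections import defaultdict


def potential_targeted_ads(observations):
    """Keep images that were seen by exactly one persona.

    `observations` is an iterable of (persona_id, image_hash) pairs.
    Returns a dict mapping each kept image hash to its single persona.
    """
    personas_by_image = defaultdict(set)
    for persona_id, image_hash in observations:
        personas_by_image[image_hash].add(persona_id)
    return {
        image: next(iter(personas))
        for image, personas in personas_by_image.items()
        if len(personas) == 1
    }
```

An image shown to two different personas is dropped as untargeted or contextual, while an image seen repeatedly by the same persona is kept.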
Crowd Sourcing
We used Amazon Mechanical Turk (AMT) to label 31,850 ads.
• Total 1,142 Tasks.
• 30 ads / Task.
• 27 unlabeled.
• 3 labeled by us.
• 2 workers per ad.
• $415 spent.
Final Dataset
5,102 unique retargeted ads
• From 281 distinct online retailers
35,448 publisher-side chains that served the retargets
• We observed some retargets multiple times
Outline: Data Collection, Classifying Ad Network Flows, Results
A look at Publisher Chains
[Example diagram: shopper-side chain and publisher-side chain]
• How does Criteo know to serve an ad on BBC?
• In this case it is pretty trivial.
• Criteo observed us on the shopper.
• Can we classify all such publisher-side chains?
What is a Chain?
[Diagram: an example chain with its elements labeled e and a]
^pub .* e a$
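One way to make the pattern `^pub .* e a$` concrete is to rewrite a chain of domains in the slide's alphabet and test it with a regular expression. The sketch below is an assumption about representation, not the authors' tooling: `encode_chain`, the `x` token for other domains, and the example domains are hypothetical. Since tokens are separated by single spaces, the slide's `.*` is written as `( \S+)*` here.

```python
import re

# Space-separated tokens: "pub", then any tokens, then "e a" at the end.
PUBLISHER_RULE = re.compile(r"^pub( \S+)* e a$")


def encode_chain(domains, advertiser, exchange):
    """Rewrite a publisher-side chain of domains in the slide's alphabet.

    The first element is the publisher ("pub"); `advertiser` becomes "a",
    `exchange` becomes "e", and every other domain becomes "x".
    """
    tokens = ["pub"]
    for domain in domains[1:]:
        if domain == advertiser:
            tokens.append("a")
        elif domain == exchange:
            tokens.append("e")
        else:
            tokens.append("x")
    return " ".join(tokens)
```

For instance, a chain publisher → some tracker → exchange → advertiser encodes to `pub x e a`, which the rule accepts, while `pub a` (no exchange before the advertiser) is rejected.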
Four Classifications
Four possible ways for a retargeted ad to be served
1. Direct (Trivial) Matching
2. Cookie Matching
3. Indirect Matching
4. Latent (Server-side) Matching
1) Direct (Trivial) Matching
Rule (shopper-side): ^shop .* a .*$
• a must appear on the shopper-side…
• … but other trackers may also appear
Rule (publisher-side): ^pub a$
• a is the advertiser that serves the retarget
2) Cookie Matching
Rule (shopper-side): ^shop .* a .*$
• a must appear on the shopper-side
Rule (publisher-side): ^pub .* e a$
• e precedes a, which implies an RTB auction
Rule (anywhere): ^* .* e a .*$
• Transition e → a is where cookie match occurs
3) Latent (Server-side) Matching
Rule (shopper-side): ^shop [^ea]$
• Neither e nor a appears on the shopper-side
Rule (publisher-side): ^pub .* e a$
• a must receive information from some shopper-side tracker
We find latent matches in practice!
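The three rules presented on these slides can be sketched with plain list and set checks instead of regular expressions. This is a simplified illustration under assumptions: `classify` and its labels are hypothetical, the separate "anywhere" condition for the e → a transition is not checked, and the indirect category, which the slides name without giving a rule, falls into "other".

```python
def classify(publisher_chain, shopper_chains):
    """Label one publisher-side chain using the slides' three rules.

    `publisher_chain` is a list of domains that starts with the publisher
    and ends with the advertiser `a` that served the retarget.
    `shopper_chains` is a list of chains observed at the shop, each
    starting with the shop's own domain.
    """
    if len(publisher_chain) < 2:
        return "no match"
    a = publisher_chain[-1]
    # Trackers seen shopper-side: everything after the shop itself.
    shopper_side = {d for chain in shopper_chains for d in chain[1:]}

    if len(publisher_chain) == 2:
        # ^pub a$ : the publisher includes the advertiser directly.
        return "direct" if a in shopper_side else "no match"

    e = publisher_chain[-2]  # ^pub .* e a$ : e immediately precedes a
    if a in shopper_side:
        return "cookie"   # a saw the user at the shop, e handed over the impression
    if e not in shopper_side:
        return "latent"   # neither e nor a appeared at the shop
    return "other"        # e was at the shop but a was not; no rule on these slides
```

The latent case is the interesting one: the advertiser never appeared at the shop, so whatever it knows about the user must have reached it out of the browser's sight.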
Outline: Data Collection, Classifying Ad Network Flows, Results
Categorizing Chains
| Type | Raw Chains | Raw % | Clustered Chains | Clustered % |
|---|---|---|---|---|
| Direct (Trivial) Match | 1770 | 5 | 8449 | 24 |
| Cookie Match | 25049 | 71 | 25873 | 73 |
| Latent (Server-side) Match | 5362 | 15 | 343 | 1 |
| No Match | 775 | 2 | 183 | 1 |

Takeaways:
1. As expected, most retargets are due to cookie matching
2. Very small number of chains that cannot be categorized
• Suggests low false positive rate of AMT image labeling task
3. Surprisingly large number of latent matches…
Cluster together domains by “owner”
• E.g. google.com, doubleclick.com, googlesyndication.com
Latent matches essentially disappear
• The vast majority of these chains involve Google
• Suggests that Google shares tracking data across their services
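Clustering by owner can be sketched as a lookup applied to every domain in a chain. The ownership table below contains only the three Google domains named on the slide; merging adjacent repeats after the rewrite is an assumption of this sketch, and `cluster_chain` is a hypothetical helper.

```python
# Ownership table; only these Google entries come from the slide.
OWNER = {
    "google.com": "google",
    "doubleclick.com": "google",
    "googlesyndication.com": "google",
}


def cluster_chain(chain, owner=OWNER):
    """Replace each domain by its owner and merge adjacent repeats."""
    clustered = []
    for domain in chain:
        name = owner.get(domain, domain)
        if not clustered or clustered[-1] != name:
            clustered.append(name)
    return clustered
```

A publisher-side chain ending in `googlesyndication.com`, `doubleclick.com` collapses to a single `google` element. If another Google domain was present at the shop, a chain that looked latent on raw domains is no longer latent once all of them are treated as one party.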
Who is Cookie Matching?
| Participant 1 | Participant 2 | Chains | Ads | Heuristics |
|---|---|---|---|---|
| criteo | googlesyndication | 9090 | 1887 | P |
| criteo | doubleclick | 3610 | 1144 | E, P DC, P |
| criteo | adnxs | 3263 | 1066 | E, P |
| criteo | rubiconproject | 1586 | 749 | E, P |
| criteo | servedbyopenx | 707 | 460 | P |
| doubleclick | steelhousemedia | 362 | 27 | P E, P |
| mathtag | mediaforge | 360 | 124 | E, P |
| netmng | scene7 | 267 | 119 | E ? |
| googlesyndication | adsrvr | 107 | 29 | P |
| rubiconproject | steelhousemedia | 86 | 30 | E |
| googlesyndication | steelhousemedia | 47 | 22 | ? |
| adtechus | adacado | 36 | 18 | ? |
| atwola | adacado | 32 | 6 | ? |
| adroll | adnxs | 31 | 8 | ? |

Heuristics Key (used by prior work)
E – share exact cookies
P – special URL parameters
DC – DoubleClick URL parameters
? – Unknown sharing method
31% of cookie matching partners would be missed.
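To see why heuristics miss partners, consider a sketch of the "E" heuristic (two parties holding the exact same cookie value). This is an illustration under assumptions, not the implementation used by prior work: `exact_cookie_pairs`, the input format, and the `min_length` threshold are hypothetical.

```python
from itertools import combinations


def exact_cookie_pairs(cookies_by_domain, min_length=8):
    """Return domain pairs that store an identical cookie value.

    `cookies_by_domain` maps a domain to the set of cookie values it set.
    Short values are ignored because they may collide by chance; the
    `min_length` threshold is an assumption of this sketch.
    """
    pairs = set()
    for d1, d2 in combinations(sorted(cookies_by_domain), 2):
        shared = cookies_by_domain[d1] & cookies_by_domain[d2]
        if any(len(value) >= min_length for value in shared):
            pairs.add((d1, d2))
    return pairs
```

A check like this only fires when the identifier appears verbatim on both sides. Partners that hash or encrypt the identifier, or that sync on the server side, leave nothing for it to find, which is consistent with the rows marked "?" in the table.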
Summary
We develop a novel methodology to detect information flows between ad exchanges
• Controlled methodology enables causal inference
• Defeats obfuscation attempts
• Detects client- and server-side flows
Dataset gives a better picture of the ad ecosystem
• Reveals which ad exchanges are linking information about users
• Allows us to reason about how information is being transferred