Privacy-protecting Techniques, IS/CS 698, Min Song
Posted 21-Dec-2015
1
Privacy-protecting Techniques
IS/CS 698, Min Song
2
Web Security: Big Brother is watching you - traces you leave in the Web
1. User-provided information
  Setting up an account with an online shop, purchasing tickets via the web
  What legal restrictions govern the use of the information provided? (in the US: few)
  What privacy policy does the vendor have? (the vendor’s web site should have some information about this!)
  Note: even simple demographic information may be sufficient for identification (e.g. ZIP code + birthday)
3
Further traces you leave
2. Log files
  Ubiquitous; many applications and network programs create log files of the activities the user performs
  Web logs: files created on the remote web server when a page is downloaded; information stored includes the IP address of the computer that downloaded the page, time of request, URL requested, username (sometimes), referral link, and web browser used
  Mail logs: created by mail servers; contain at least the To: and From: addresses
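To make the web-log bullet concrete, here is a minimal sketch that pulls the listed fields out of one log entry. It assumes the server writes the widely used Combined Log Format; the sample entry, its addresses and the helper name `parse_log_line` are made up for illustration.

```python
import re

# Combined Log Format is assumed here; other servers may log differently.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Made-up entry using reserved example addresses and domains.
sample = (
    '192.0.2.7 - alice [21/Dec/2015:10:15:32 +0000] '
    '"GET /tickets/checkout.html HTTP/1.1" 200 5120 '
    '"http://www.example.com/search?q=flights" "Mozilla/5.0"'
)

entry = parse_log_line(sample)
for field in ("ip", "time", "url", "user", "referrer", "agent"):
    print(f"{field:9}: {entry[field]}")
```

Note that the referrer carries the search string of the previous page, so a single log line can reveal more than the page that was requested.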
4
3. Cookies
  Text file left by a remote web server on your computer; the cookie is sent to the web server every time a web page from that server is requested
  Allows the user to maintain a certain state while requesting different pages; example: shopping cart
5
Where are cookies stored?
  Netscape Navigator: ~/.netscape/cookies
  Microsoft Explorer: folder Cookies
Browser preferences can be adjusted so that you can refuse cookies when they are sent
6
4. Web Bugs
  Simple to program in HTML: <img src="http://…" width=1 height=1 border=0>
  This alerts the specified web server every time the page is viewed; used for outsourced web site monitoring
  Useful for gathering web use statistics
  Can also be used to check when HTML e-mails are read, or to send personal information encoded in the URL (as Google puts search strings in the URL)
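The last point, personal information encoded in the URL, can be sketched as follows. The tracker host `tracker.example`, the parameter names and both helper functions are hypothetical; the sketch only shows that whatever is placed in the query string is readable by the server that receives the image request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def web_bug_tag(tracker_url, **info):
    """Build a 1x1 image tag whose URL carries `info` as query parameters."""
    return f'<img src="{tracker_url}?{urlencode(info)}" width=1 height=1 border=0>'

def decode_request(requested_url):
    """What the tracking server can read back from the URL it was asked for."""
    query = urlsplit(requested_url).query
    return {key: values[0] for key, values in parse_qs(query).items()}

# Hypothetical tracker and recipient, embedded in an HTML e-mail.
tag = web_bug_tag("http://tracker.example/pixel.gif",
                  recipient="pauly@example.org", campaign="newsletter-7")
print(tag)

url = tag.split('"')[1]          # the URL the mail client would fetch
print(decode_request(url))
```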
7
Privacy-Protecting Techniques
Picking a good password (and not writing it down)
Passwords can be captured by password sniffers when they are transmitted over the network; beware of protocols which do NOT use encryption:
  FTP (File Transfer Protocol)
  HTTP (Hypertext Transfer Protocol)
  POP (Post Office Protocol)
  TELNET (Remote Terminal Protocol)
  RLOGIN (Remote Login for UNIX machines)
8
More Privacy-Protecting Techniques
Avoiding Spam and Junk Mail
  Don’t put your e-mail address on your home page, or write it in a disguised form, e.g. “pauly (and now this strange symbol) csc.liv.ac (in you know which country)”
  Take your e-mail address out of online directories
  Don’t post to public mailing lists
  Pick an unusual username
  Address Mangling (Munging)
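Address munging can be as simple as spelling out the punctuation so that a naive address harvester no longer sees an `@`. The following is only a sketch of one possible convention (the words "at" and "dot"); the address is a placeholder and both function names are made up.

```python
def munge(address):
    """Rewrite user@host.tld as 'user at host dot tld'."""
    user, domain = address.split("@", 1)
    return f"{user} at {domain.replace('.', ' dot ')}"

def unmunge(text):
    """What a human reader does in their head to recover the address."""
    return text.replace(" at ", "@", 1).replace(" dot ", ".")

munged = munge("pauly@example.org")
print(munged)
print(unmunge(munged))
```

A harvester that knows the convention can undo it just as easily, so munging raises the cost of collection rather than preventing it.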
9
Privacy-Protecting Technologies
Antispam Services: analyse your e-mail to check whether it is spam, using AI technology, whitelists + confirmation e-mails, etc.
  Brightmail personal edition (www.brightmail.com)
Antispam Software: does the same, but runs on your computer; your mail stays where it belongs
  Spammerslammer (www.nowtools.com)
Browsers allow you to refuse cookies
10
More Privacy-Protecting Technology
Anonymous Browsing – protecting your IP address
  Use a public terminal (e.g. at a library)
  Use a proxy server of your internet service provider; in this case, the proxy server’s IP address is passed on
  Use anonymous web browsing services; they usually work as proxy servers
Secure E-mail: encrypt messages before sending
  Hushmail: http://www.hushmail.com/
11
Secure Sockets Layer (SSL)
Uses a cryptographic protocol for sending information over the web
Main usage with web pages: https://…
Browsers will usually tell you whether the current page/document is “secure”
Example: Booking a flight with a credit card…
12
Browser Cookie Management
Cookie ownership
  Once a cookie is saved on your computer, only the Web site that created the cookie can read it.
Variations
  Temporary cookies: stored until you quit your browser
  Persistent cookies: remain until deleted or expired
  Third-party cookies: originate on or are sent to a web site other than the one that provided the current page
13
Storing Info Across Sessions
A cookie is a file created by an Internet site to store information on your computer
[Figure: the browser enters form data and the server stores a cookie; later the server requests the cookie and the browser returns the data]
HTTP is a stateless protocol; cookies add state
A cookie includes a domain (who can read it), an expiration, and a “secure” flag (can be read only over SSL)
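The domain, expiration and "secure" attributes can be seen in a header built with Python's standard `http.cookies` module. This is a sketch only: the cookie name `cart`, its value and the domain `shop.example` are invented for the example.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header for a shopping-cart cookie.
outgoing = SimpleCookie()
outgoing["cart"] = "item42"
outgoing["cart"]["domain"] = "shop.example"   # who can read it
outgoing["cart"]["max-age"] = 3600            # expiration, in seconds
outgoing["cart"]["secure"] = True             # send only over SSL
print(outgoing.output())

# Later request: the browser sends the cookie back and the server parses it.
incoming = SimpleCookie()
incoming.load("cart=item42")
print(incoming["cart"].value)
```

Only the name/value pair travels back to the server; the attributes are instructions to the browser about when to send it.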
14
What Are Cookies Used For?
Authentication: use the fact that the user authenticated correctly in the past to make future authentication quicker
Personalization: recognize the user from a previous visit
Tracking: follow the user from site to site; learn his/her browsing behavior, preferences, and so on
15
Privacy Issues with Cookies
A cookie may include any information about you known by the website that created it
  Browsing activity, account information, etc.
Sites can share this information
  Advertising networks
  2o7.net tracking cookie
Browser attacks could invade your “privacy”
  November 8, 2001: Users of Microsoft's browser and e-mail programs could be vulnerable to having their browser cookies stolen or modified due to a new security bug in Internet Explorer (IE), the company warned today
16
Austin American-Statesman
The website “adinterax.com” has requested to save a file on your computer called a “cookie.” This file may be used to track usage information…
17
The Weather Channel
The website “twci.coremetrics.com” has requested to save a file on your computer called a “cookie.” This file may be used to track usage information…
18
MySpace
The website “insightexpressai.com” has requested to save a file on your computer called a “cookie”…
19
Let’s Take a Closer Look…
20
Storing State in Browser
Dansie Shopping Cart (2006)
“A premium, comprehensive, Perl shopping cart. Increase your web sales by making it easier for your web store customers to order.”
<FORM METHOD=POST ACTION="http://www.dansie.net/cgi-bin/scripts/cart.pl">
  Black Leather purse with leather straps<BR>Price: $20.00<BR>
  <INPUT TYPE=HIDDEN NAME=name VALUE="Black leather purse">
  <INPUT TYPE=HIDDEN NAME=price VALUE="20.00">
  <INPUT TYPE=HIDDEN NAME=sh VALUE="1">
  <INPUT TYPE=HIDDEN NAME=img VALUE="purse.jpg">
  <INPUT TYPE=HIDDEN NAME=custom1 VALUE="Black leather purse with leather straps">
  <INPUT TYPE=SUBMIT NAME="add" VALUE="Put in Shopping Cart">
</FORM>
The hidden price field is sent back by the browser, so the user can change 20.00 to 2.00 before submitting: bargain shopping!
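The weakness is that the server believes a value the client controls. A minimal sketch of the difference, assuming a hypothetical server-side price list `CATALOG` and two made-up handler functions (this is not how the Dansie cart is implemented):

```python
from decimal import Decimal

# Hypothetical price list kept on the server, out of the client's reach.
CATALOG = {"Black leather purse": Decimal("20.00")}

def charge_trusting_form(form):
    """Vulnerable: takes the price from the submitted hidden field."""
    return Decimal(form["price"])

def charge_from_catalog(form):
    """Safer: uses the form only to identify the item, then looks up the price."""
    return CATALOG[form["name"]]

# Form data after the user edited the hidden price field.
tampered_form = {"name": "Black leather purse", "price": "2.00", "sh": "1"}

print(charge_trusting_form(tampered_form))
print(charge_from_catalog(tampered_form))
```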
21
Storing State in Browser Cookies
Set-cookie: price=299.99
User edits the cookie… cookie: price=29.99
What’s the solution?
  Add a MAC to every cookie, computed with the server’s secret key
  price=299.99; HMAC(ServerKey, 299.99)
But what if the website changes the price?
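A sketch of the MAC idea using Python's standard `hmac` module. The key, the `value|tag` cookie layout and the choice of SHA-256 are assumptions made for the example; the slide does not specify them.

```python
import hashlib
import hmac

SERVER_KEY = b"demo-key-never-leaves-the-server"  # hypothetical secret key

def sign(value):
    """Append an HMAC tag so the server can later detect edits."""
    tag = hmac.new(SERVER_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}|{tag}"

def verify(cookie):
    """Return the value if the tag matches, otherwise None."""
    value, _, tag = cookie.rpartition("|")
    expected = hmac.new(SERVER_KEY, value.encode(), hashlib.sha256).hexdigest()
    return value if hmac.compare_digest(tag, expected) else None

cookie = sign("price=299.99")
edited = cookie.replace("299.99", "29.99")   # user edits the cookie

print(verify(cookie))
print(verify(edited))
```

The closing question on the slide still applies to this sketch: a cookie signed while the price was lower continues to verify after the price changes, because the tag covers only the value.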
22
Web Authentication via Cookies
Need an authentication system that works over HTTP and does not require servers to store session data
  Why is it a bad idea to store session state on the server?
Servers can use cookies to store state on the client
  When a session starts, the server computes an authenticator and gives it back to the browser in the form of a cookie
  An authenticator is a value that the client cannot forge on his own
  Example: hash(server’s secret key, session id)
With each request, the browser presents the cookie
  The server recomputes and verifies the authenticator
  The server does not need to remember the authenticator
23
Typical Session with Cookies
[Figure: message sequence between client and server]
  Client: POST /login.cgi
  Server: verifies that this client is authorized, then replies with Set-Cookie: authenticator
  Client: GET /restricted.html with Cookie: authenticator
  Server: checks the validity of the authenticator (e.g., recomputes hash(key, sessId)), then returns the restricted content
Authenticators must be unforgeable and tamper-proof (a malicious client shouldn’t be able to compute his own or modify an existing authenticator)
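The session above can be sketched without any server-side session table. Everything specific here is an assumption: the keyed hash is instantiated with HMAC-SHA-256, the cookie layout is `sessId:authenticator`, and the credential check is a placeholder.

```python
import hashlib
import hmac
import secrets

KEY = secrets.token_bytes(32)  # server's secret key, generated at start-up

def authenticator(session_id):
    """hash(key, sessId), instantiated here as HMAC-SHA-256."""
    return hmac.new(KEY, session_id.encode(), hashlib.sha256).hexdigest()

def login(username, password):
    """POST /login.cgi: return a Set-Cookie value if the login succeeds."""
    if (username, password) != ("alice", "correct horse"):  # placeholder check
        return None
    session_id = secrets.token_hex(8)
    return f"{session_id}:{authenticator(session_id)}"

def get_restricted(cookie):
    """GET /restricted.html: recompute the authenticator from the cookie alone."""
    session_id, _, presented = (cookie or "").partition(":")
    if hmac.compare_digest(presented, authenticator(session_id)):
        return "Restricted content"
    return "403 Forbidden"

cookie = login("alice", "correct horse")
print(get_restricted(cookie))
print(get_restricted("deadbeef:" + "0" * 64))   # forged authenticator
```

The server keeps only `KEY`; nothing is remembered per session, which is the property the slide asks for.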
24
Third-party cookies
Get a page from merchant.com
  Contains <img src=http://doubleclick.com/advt.gif>
  Image fetched from DoubleClick.com
  DoubleClick knows the IP address and the page you were looking at
DoubleClick sends back a suitable advertisement
  Stores a cookie that identifies "you" at DoubleClick
Next time you get a page with a doubleclick.com image
  Your DoubleClick cookie is sent back to DoubleClick
  DoubleClick could maintain the set of sites you viewed
  Send back targeted advertising (and a new cookie)
Cooperating sites
  Can pass information to DoubleClick in URL, …
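The bookkeeping an ad server needs for this is very small. The toy class below is not a model of DoubleClick or any real network; the class, the cookie format and the page names are invented to show how one cookie links visits to unrelated sites.

```python
import itertools

class AdServer:
    """Toy third-party ad server that profiles browsers by cookie."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.profiles = {}   # cookie id -> pages on which the ad image was loaded

    def serve_ad(self, cookie, referring_page):
        """Handle one image request; return the cookie the browser should keep."""
        if cookie is None:                       # first contact: assign an id
            cookie = f"id{next(self._ids)}"
        self.profiles.setdefault(cookie, []).append(referring_page)
        return cookie

ads = AdServer()
browser_cookie = None   # the browser starts without a cookie for the ad server
for page in ["merchant.example/shoes", "news.example/politics", "health.example/flu"]:
    browser_cookie = ads.serve_ad(browser_cookie, page)

print(ads.profiles)
```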
25
Example: Mortgage Center
<html><title>Mortgage Center</title><body>… http://www.loanweb.com/ad.asp?RLID=0b70at1ep0k9
What’s this?
26
Cookie issues
Cookies maintain a record of your browsing habits
  A cookie stores information as a set of name/value pairs
  May include any information a web site knows about you
  Sites track your activity from multiple visits to the site
  Sites can share this information (e.g., DoubleClick)
Browser attacks could invade your “privacy”
27
Sample Proxy:
Cookie management by policy in cookiefile
  Default: all cookies are silently crunched
  Options
    Allow cookies only to/from certain sites
    Block cookies to browser (but allow to server)
    Send vanilla wafers instead
Block URLs matching any pattern in blockfile
  Example: pattern /*.*/ad matches http://nomatterwhere.com/images/advert/g3487.gif
Easy to write your own http proxy; you can try this at home
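The URL-blocking rule is the part that is easiest to try at home. The sketch below assumes that a blockfile pattern is a regular expression matched against the start of the URL path; that reading reproduces the slide's example, but the real proxy's pattern syntax may differ, and `is_blocked` is a made-up name.

```python
import re
from urllib.parse import urlsplit

def is_blocked(url, blockfile_patterns):
    """True if the URL path matches any pattern (regex, anchored at path start)."""
    path = urlsplit(url).path
    return any(re.match(pattern, path) for pattern in blockfile_patterns)

patterns = ["/*.*/ad"]
print(is_blocked("http://nomatterwhere.com/images/advert/g3487.gif", patterns))
print(is_blocked("http://nomatterwhere.com/index.html", patterns))
```

A proxy would run this check on each request line and answer blocked URLs itself instead of forwarding them.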
28
Fooling the user
Bad guy sends email: “There is a problem with your eBuy account”
User clicks on email link to www.ebuj.com.
User thinks it is ebuy.com, enters eBuy username and password.
Password sent to bad guy
29
Password Phishing Problem
User cannot reliably identify fake sites
Captured password can be used at target site
[Figure: the user sends pwdA to the fake site, which then uses pwdA at Bank A]
30
Common Password Problem
Phishing attack or break-in at site B reveals pwd at A
  Server-side solutions will not keep pwd safe
  Solution: Strengthen with client-side support
[Figure: the user uses pwdA at Bank A (high security site) and pwdB = pwdA at Site B (low security site)]
31
Password Hashing
Generate a unique password per site
  HMAC_fido:123(banka.com) → Q7a+0ekEXb
  HMAC_fido:123(siteb.com) → OzX2+ICiqc
Hashed password is not usable at any other site
  Protects against password phishing
  Protects against common password problem
[Figure: the user types pwdA = pwdB; Bank A receives hash(pwdA, BankA) and Site B receives hash(pwdB, SiteB)]
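A sketch of per-site password hashing: the user's password is the HMAC key and the site's domain is the message. The hash function (SHA-256), the Base64 encoding and the 10-character truncation are assumptions, so the output will not reproduce the sample strings on the slide.

```python
import base64
import hashlib
import hmac

def site_password(master_password, domain, length=10):
    """Derive a site-specific password: HMAC keyed by the user's password."""
    digest = hmac.new(master_password.encode(), domain.encode(),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode()[:length]

bank = site_password("fido:123", "banka.com")
site_b = site_password("fido:123", "siteb.com")
print(bank)
print(site_b)
```

Because the domain is part of the input, a password typed at a look-alike domain hashes to a value that is useless at the real site.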
32
Password Hashing: a popular idea
Recent password hashing projects:
  Password Generator Extension
  Password Composer
  Passwdlet
  Genpass
  PwdHash: http://crypto.stanford.edu/PwdHash/RemotePwdHash/
Similar hashing algorithms
Only PwdHash defends against spoofing and is invisible to the user
33
The Spoofing Problem
JavaScript can display password fields or dialogs:
Unhashed password sent to attacker in clear
34
Password Prefix
Original pwd should never be visible to web page
[Figure: the user types @@fido:123; the web page sees @@abcdefgh; Site B receives OzX2+ICiqc]
35
Password Prefix: How it works
Normal operation: Prefix in password field
  @@fido:123 is typed; the page sees @@abcdefgh, displayed as **********
  abcdefgh is mapped back to fido:123
  HMAC_fido:123(siteb.com) → OzX2+ICiqc
Abnormal operation: Prefix in non-password field
  Can just ignore the prefix and not hash
  Remind user not to enter password
36
The Perfect Phishing Email
Bank of America customers see: “Click here to see your Bank of America statement”
Wells Fargo customers see: “Click here to see your Wells Fargo statement”
Works in Outlook; behavior is by design
Fooling the user using browser state
37
Reading browser history
CSS properties of hyperlinks
Can also use cache-based techniques
Violation of the same-origin principle: “One site cannot use information belonging to another site.”
38
Visited link tracking
Visited links displayed in different color (74% of sites)
  Information easily accessible by javascript
Attacks also without javascript
  Bank logo images are stacked on top of each other
  CSS rules cause the un-visited links to vanish
  Page displays bank logo of site that user has visited
<html><head><style> a { position:absolute; border:0; } a:link { display:none } </style></head><body><a href='http://www.bankofamerica.com/'><img src='bankofamerica.gif'></a><a href='https://www.wellsfargo.com/'><img src='wellsfargo.gif'></a><a href='http://www.usbank.com/'><img src='usbank.gif'></a>...</body></html>
http://www.safehistory.com/
39
Preserving web privacy
Your IP address may be visible to web sites
  This may reveal your employer, ISP, etc.
  Can link activities on different sites, different times
Can you prevent sites from learning about you?
  Anonymizer: single site that hides origin of web request
  Crowds: distributed solution
40
Browsing Anonymizers
Web Anonymizer hides your IP address
[Figure: Browser → Anonymizer → Server, via www.anonymizer.com/cgi-bin/redirect.cgi?url=…]
What does anonymizer.com know about you?
41
Related approach to anonymity
Hide source of messages by routing them randomly
  Routers don’t know for sure if the apparent source of the message is the actual sender or simply another router
Existing systems: Freenet, Crowds, etc.
42
Crowds [Reiter,Rubin ‘98]
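[Figure: a message travels from the sender through crowd members (C0, C1, C2, C3, C4, …) to the recipient; each member forwards within the crowd with probability Pf or sends to the recipient with probability 1-Pf]
Sender randomly chooses a path through the crowd
Some routers are honest, some corrupt
After receiving a message, honest router flips a coin
  With probability Pf routes to the next member on the path
  With probability 1-Pf sends directly to the recipient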
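The coin-flipping rule can be simulated in a few lines. This sketch assumes that the sender first hands the message to a uniformly chosen member and that every later hop is also chosen uniformly at random; the crowd size of 5 and Pf = 0.75 are arbitrary values picked for the example.

```python
import random

def crowd_path(members, p_f, rng):
    """Return the list of crowd members a message visits before delivery."""
    path = [rng.choice(members)]      # sender hands the message to a random member
    while rng.random() < p_f:         # coin flip: forward within the crowd again...
        path.append(rng.choice(members))
    return path                       # ...or send directly to the recipient

rng = random.Random(0)
members = [f"C{i}" for i in range(5)]
lengths = [len(crowd_path(members, 0.75, rng)) for _ in range(10_000)]
mean_length = sum(lengths) / len(lengths)
print(mean_length)   # should be close to 1 / (1 - 0.75) = 4 hops
```

Under these assumptions the expected path length is 1/(1-Pf), so a larger Pf buys longer paths at the cost of latency.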
43
What Does Anonymity Mean?
Beyond suspicion
  The observed source of the message is no more likely to be the actual sender than anybody else
Probable innocence
  Probability <50% that the observed source of the message is the actual sender
Possible innocence
  Non-trivial probability that the observed source of the message is not the actual sender
Guaranteed by Crowds if there are sufficiently few corrupt routers
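"Sufficiently few" can be made concrete. The Reiter and Rubin paper gives a sufficient condition for probable innocence: with n crowd members, c of them corrupt, and forwarding probability Pf > 1/2, it is enough that n ≥ Pf/(Pf − 1/2) · (c + 1). The function below merely evaluates that bound; its name and the sample numbers are chosen for illustration.

```python
def probable_innocence(n, c, p_f):
    """Check the sufficient condition n >= p_f / (p_f - 1/2) * (c + 1).

    Returns False when p_f <= 1/2, where the bound does not apply.
    """
    if p_f <= 0.5:
        return False
    return n >= p_f / (p_f - 0.5) * (c + 1)

# With p_f = 0.75 the factor is 3, so 2 corrupt members need a crowd of at least 9.
print(probable_innocence(9, 2, 0.75))
print(probable_innocence(8, 2, 0.75))
```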
44
How web sites use your information
You may enter information to buy product
  Name, address, credit card number, …
How will web site use this information
  Charge your card and mail your purchase
  Give sales information to other businesses?
Platform for Privacy Preferences (P3P)
  Framework for reaching agreement on use of personal information
  Enforcement at server side is another matter…
45
Basic P3P Concepts
[Figure: P3P components: user, user agent, preferences, user data repository, service, data practices; a proposal and an agreement pass between the user agent and the service]
Credit: Lorrie Cranor
46
A Simple P3P Conversation
User agent: Get index.html
Service: Here is my P3P proposal - I collect click-stream data and computer information for web site and system administration and customization of site
User agent: OK, I accept your proposal
Service: Here is index.html
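The user agent's decision in this conversation amounts to comparing the proposal with the user's preferences. The sketch below uses plain dictionaries and informal labels taken from the slide's wording; it does not use P3P's actual vocabulary or file format, and `evaluate_proposal` is a made-up name.

```python
def evaluate_proposal(proposal, preferences):
    """Accept only if every purpose the service states is allowed for that data."""
    for data, purposes in proposal.items():
        allowed = preferences.get(data, set())
        if not set(purposes) <= allowed:
            return False
    return True

# What the user is willing to allow, per kind of data.
preferences = {
    "click-stream": {"administration", "customization"},
    "computer information": {"administration", "customization"},
}

# The service's proposal from the conversation above.
proposal = {
    "click-stream": ["administration", "customization"],
    "computer information": ["administration", "customization"],
}

print(evaluate_proposal(proposal, preferences))
print(evaluate_proposal({"email address": ["marketing"]}, preferences))
```

The check says nothing about whether the service keeps its word, which is the enforcement problem noted earlier.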
47
CAPTCHAs
A CAPTCHA is a type of challenge-response test used in computing to determine whether the user is human.
Stands for Completely Automated Public Turing test to tell Computers and Humans Apart, trademarked by Carnegie Mellon University.
Also known as Reverse Turing Tests.
48
CAPTCHAs
Public: Details of algorithm used to generate the tests are publicly available. www.captcha.net/
Mostly used for spam-fighting related purposes
  Prevent automated email account creation: Hotmail, Yahoo! Mail.
  Make sender of email confirm he or she is a human before the message is delivered to its recipient.
49
CAPTCHAs: Example
from Yahoo! Mail new account signup: uses EZ-Gimpy
Gimpy (www.captcha.net): http://captchas.net/sample/php/query.php
Gimpy has been broken: Greg Mori and Jitendra Malik (UC Berkeley Computer Vision Group), Recognizing Objects in Adversarial Clutter: Breaking a Visual CAPTCHA, CVPR 2003. 92% success against EZ-Gimpy, 33% against Gimpy.
50
Captcha Disadvantages
Accessibility issues
  Visually impaired people
    Audio captchas now exist, but are they being used in practice?
  Text-only users
    Lynx users
    Cell phone WAP browser users
Usability/inconvenience issues
  Forcing customers to answer a captcha may drive some of them to a competitor