Challenges to Privacy in New Internet Applications: VoIP, IM, location-based services
Prof. Henning Schulzrinne
Computer Science, Columbia University, New York
ATIS Network Security Symposium and Workshop
Washington, DC, September 2004
Overview
• Email spam: a history of failed miracle cures
• New challenges emerging:
  • VoIP unsolicited calls
  • instant messaging
• Location and presence privacy
• Reputation management
Not just email
• Email is just the first large-scale, open communication medium
  • initially also "closed user groups" (DECmail, PROFS, UUnet, Fido, …)
• When does UBC (unsolicited bulk communication) occur?
  • Move from a single domain to a large number of independently operated domains
    • removes easy remote authentication
  • Published or guessable addresses
    • as old as unlisted numbers
    • conflict between usability and using addresses as communication keys
• Others emerging from closed user groups
  • instant messaging (IM)
  • VoIP and multimedia calls
  • presence queries
The universe of message senders
[Figure: message senders arranged by human involvement, from human to machine: human users (known and unknown), opt-in bulk communications, mailing lists (forwarders), robots (event notification), machine-to-machine (EDI).]
The problem is easy…
… if you're willing to make some minor assumptions:
• single administrative domain
• only previously-known senders (but how?)
• global public key infrastructure (PKI)
• only real human users, no lists
Communication challenges
• Joe-job
  • "The act of faking a spam so that it appears to be from an innocent third party, in order to damage their reputation and possibly to trick their provider into revoking their Internet access. Named after Joes.com, which was victimized in this way by a spammer some years ago."
• Phishing
  • "The act of sending an e-mail to a user falsely claiming to be an established legitimate enterprise in an attempt to scam the user into surrendering private information that will be used for identity theft."
• Spam, spim (unsolicited bulk communications)
• Nuisance communications
Tools available: countermeasures
• From address blacklisting
• IP sender blacklisting
• Content filtering (Bayesian filters)
[Figure: mail path from the mail sender over SMTP to the MTA, which consults DNS (SPF, SBL, …) and marks messages with a header; the MUA retrieves mail via POP/IMAP and files marked messages in a spam folder.]
Miracle cures
Each method, what it does, and where it fails:
• postage for email
  • does: sender pays receiver for reading mail
  • fails: collection, socially unacceptable (job offer), mailing lists
• Haiku (e.g., Habeas)
  • does: include copyrighted haiku in non-spam
  • fails: enforcement
• computation (e.g., Microsoft Penny Black)
  • does: sender solves computational puzzle
  • fails: lists, bots
• Turing test (challenge-response)
  • fails: automated senders
• Graylisting
  • does: return temporary failure; UBC sender gives up
  • fails: unreliable, delay
• Address hiding & spoofing
  • does: prevents web crawlers from picking up addresses
  • fails: existing addresses, single failure
• Bayesian content analysis
  • does: detect spam/non-spam terms
  • fails: pictures, word spoofing, poisoning, IM, VoIP, …
• Bonding
  • does: third party (e.g., BSP) promises bond if domain spams
  • fails: enforcement
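The "sender solves computational puzzle" idea can be illustrated with a generic hashcash-style sketch. This is not the Penny Black design; the function names, the use of SHA-256 and the message format are assumptions chosen only to show the asymmetry between the sender's work and the receiver's check.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve_puzzle(message: str, difficulty: int) -> int:
    """Sender side: search for a nonce; expected work grows as 2**difficulty."""
    for nonce in count():
        digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify_puzzle(message: str, nonce: int, difficulty: int) -> bool:
    """Receiver side: a single hash is enough to check the sender's work."""
    digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty
```

The failure cases listed for this method show up directly: a legitimate mailing list must pay the cost once per recipient, while a bot army spreads the cost over machines the spammer does not pay for.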
The UBC arms race
[Figure: escalation between countermeasures and evasions; labels: IP blacklisting, open relays, From blacklisting, sender faking, bot armies, RBL, SPEWS, Bayesian filters, pictures, dictionary attacks, SPF, DMP, …]
"We need a new mail/IM protocol"
• True: SMTP not designed for today's hostile Internet
  • no sender authentication
  • no easy policy inclusion
• False: A new mail protocol is going to fix UBE/UBC
• Hard problems are ecosystem, not protocol:
  • authentication: domains and individuals
    • PKI (S/MIME, PGP) has never scaled
    • current email certificates just certify ownership of email address
      • help with whitelist, but not with unknown users
      • too costly for true verification
  • reputation
  • accreditation
IETF MARID
• IETF working group for verifying sender
  • "It would be useful for those maintaining domains and networks to be able to specify that individual hosts or nodes are authorized to act as MTAs for messages sent from those domains or networks. This working group will develop a DNS-based mechanism for storing and distributing information associated with that authorization."
• related to IRTF ASRG (Anti-Spam Research Group)
• DNS extensions, "purported responsible address"
MARID processing
"Given an email message, and given an IP address from which it has been (or will be) received, is the SMTP client at that IP address authorized to send that email message?"
[Figure: flowchart: extract purported responsible address (PRA); extract purported responsible domain (PRD); SPF: IP legal for PRD? (Y/N); client SMTP validation (CSV): client authenticated, authorized and accredited? (Y)]
MARID: Client SMTP Validation (CSV)
• authentication: EHLO domain real?
• authorization: Host authorized to be MTA?
• accreditation: Domain reputation?
• drafts: draft-ietf-marid-csv-intro, draft-ietf-marid-csv-csa, draft-ietf-marid-csv-dna
Example: EHLO aol.com from 64.12.187.24
• A(aol.com): IN A 64.12.187.24
• SRV(_client._smtp.aol.com): SRV weight=2
• PTR(aol.com): _vouch.smtp.isgood.com
• TXT(aol.com.isgood.com): IN TXT MARID,1,A
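The three CSV questions (authentication, authorization, accreditation) can be read as a short decision chain. The sketch below is only illustrative: `FakeDns`, its fields and the result strings are hypothetical stand-ins for the DNS lookups, and the record encodings defined in the CSV drafts are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDns:
    """Stand-in for DNS: plain dictionaries instead of real lookups."""
    a_records: dict = field(default_factory=dict)        # domain -> set of IPs
    authorized_mtas: dict = field(default_factory=dict)  # domain -> bool
    accreditation: dict = field(default_factory=dict)    # domain -> rating


def csv_check(ehlo_domain: str, client_ip: str, dns: FakeDns) -> str:
    # Authentication: does the EHLO domain really map to the connecting IP?
    if client_ip not in dns.a_records.get(ehlo_domain, set()):
        return "fail: EHLO domain does not match client IP"
    # Authorization: has the domain said this host may act as an MTA?
    if not dns.authorized_mtas.get(ehlo_domain, False):
        return "fail: host not authorized as MTA"
    # Accreditation: does a third party vouch for the domain?
    rating = dns.accreditation.get(ehlo_domain)
    if rating is None:
        return "unknown: no accreditation record"
    return f"pass: accredited as {rating}"


dns = FakeDns(
    a_records={"aol.com": {"64.12.187.24"}},
    authorized_mtas={"aol.com": True},
    accreditation={"aol.com": "A"},
)
print(csv_check("aol.com", "64.12.187.24", dns))
```

Note that none of the steps looks at the message itself; the decision is based only on the connecting host and the domain it claims in EHLO.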
PRA (Purported responsible address)
• "Allows one to determine who appears to have most recently caused an e-mail message to be delivered. It does this by inspecting the headers in the message." (draft-ietf-marid-pra)
• uses Resent-Sender, Resent-From, Sender, From RFC 2822 headers
• draft-ietf-marid-submitter defines new MAIL parameter for SMTP
S: 220 company.com.example ESMTP server ready
C: EHLO almamater.edu.example
S: 250-company.com.example
S: 250-DSN
S: 250-AUTH
S: 250-SUBMITTER
S: 250 SIZE
C: MAIL FROM:<[email protected]> [email protected]
S: 250 <[email protected]> sender ok
C: RCPT TO:<[email protected]>
[Figure: message path: [email protected], almamater.edu, [email protected]]
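A minimal sketch of header-based PRA selection, assuming the four headers are simply tried in the order listed above. The actual draft contains further rules (for example about the position of Resent-* blocks) that are ignored here, and the function names are illustrative only.

```python
from email import message_from_string
from email.utils import parseaddr

# Header precedence as listed on the slide; extra rules in the draft are
# deliberately left out of this sketch.
PRA_HEADERS = ("Resent-Sender", "Resent-From", "Sender", "From")


def purported_responsible_address(raw_message: str):
    """Return (header_name, address) for the first usable header, or None."""
    msg = message_from_string(raw_message)
    for name in PRA_HEADERS:
        value = msg.get(name)
        if value:
            _, address = parseaddr(value)
            if address:
                return name, address
    return None


def purported_responsible_domain(address: str) -> str:
    """The PRD is taken here as the domain part of the PRA."""
    return address.rsplit("@", 1)[1].lower()


raw = (
    "From: Alice <alice@almamater.example>\n"
    "Sender: list@forwarder.example\n"
    "\n"
    "body\n"
)
header, pra = purported_responsible_address(raw)
print(header, pra, purported_responsible_domain(pra))
```

In this example the forwarder, not the original author, becomes the responsible party, which is the point of looking at Sender before From.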
SPF, Sender-ID
• SPF (sender policy framework)
• Verifies that most recent sender (e.g., mailing list forwarder) is authorized for its domain
• Does not prevent spam, but enables white- and black-listing
• Adds DNS TXT or SPF resource record (RR) for domain
  • spf2.0/mfrom,pra +mx +ip4:192.1.2.0/28 -all
  • "mail from the MX servers for example.com and from IP addresses in 192.1.2.0/28 is ok; all others are bad"
[Figure: stages of an SMTP connection at which an identity can be checked: HELO or EHLO, MAIL FROM, From:, body delivery]
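How a receiver might evaluate such a record can be sketched as follows. This handles only a tiny subset (the `mx`, `ip4` and `all` mechanisms), assumes the address range is written with an `ip4` mechanism, and takes the domain's MX addresses as a hypothetical precomputed set instead of doing DNS lookups.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def evaluate_record(record: str, client_ip: str, mx_ips: set) -> str:
    """Evaluate a small subset of an SPF-style record against a client IP."""
    ip = ipaddress.ip_address(client_ip)
    terms = record.split()[1:]  # skip the version/scope tag
    for term in terms:
        qualifier = "+"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        if term == "mx":
            matched = client_ip in mx_ips
        elif term.startswith("ip4:"):
            matched = ip in ipaddress.ip_network(term[4:])
        elif term == "all":
            matched = True
        else:
            continue  # mechanism not handled by this sketch
        if matched:
            return RESULTS[qualifier]
    return "neutral"


record = "spf2.0/mfrom,pra +mx +ip4:192.1.2.0/28 -all"
mx = {"198.51.100.7"}  # assumed MX address for the domain
print(evaluate_record(record, "192.1.2.5", mx))
print(evaluate_record(record, "203.0.113.9", mx))
```

The first matching mechanism decides the result, so the trailing `-all` acts as the default for every address not listed earlier.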
Putting the tools together
[Figure: mail path from [email protected] to [email protected]: SMTP AUTH submission (password) to the bpm.com SMTP server, then SMTP to the receiving side, checked with SPF and CSV]
• transitive trust model: intra-domain user, inter-domain domain/host-only authentication
• accreditation:
  • aol.com does not host spammers
  • bpm.com verifies user identities (not yet)
What's different about IM and VoIP?
• Higher nuisance factor
  • combine the worst of email and phone telemarketing
• Close to zero cost
  • call origination has no capacity limitation (unlike PSTN line limitation)
  • can be originated in volume from residential broadband; no T1 required
    • T1: 2.4 call attempts/second @ $1000/month + LD
    • 500 kb/s DSL: 9 call attempts/second @ $50/month
  • easy to get addresses: SIP address = email address or E.164 number
  • non-US origin: cheap labor, no DNC laws
• Privacy invasion
  • know user is actually there
• Nuisance calls
  • possibly no good way to trace
  • already a problem with Skype
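The T1 and DSL figures above can be turned into a cost per call attempt. The calculation below assumes a 30-day month, a link that is busy all the time, and ignores the long-distance (LD) charges on the T1, so it understates the gap.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month, link always busy


def cost_per_million_attempts(attempts_per_second: float,
                              dollars_per_month: float) -> float:
    """Monthly line cost divided by monthly call attempts, per million."""
    attempts_per_month = attempts_per_second * SECONDS_PER_MONTH
    return dollars_per_month / attempts_per_month * 1_000_000


t1 = cost_per_million_attempts(2.4, 1000)  # long-distance charges ignored
dsl = cost_per_million_attempts(9, 50)
print(f"T1:  ${t1:.2f} per million call attempts")
print(f"DSL: ${dsl:.2f} per million call attempts")
print(f"T1 costs about {t1 / dsl:.0f} times more per attempt")
```

Under these assumptions a call attempt over DSL is about 75 times cheaper than over a T1, before long-distance charges are counted.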
SIP spam
• Call spam
  • telemarketing
  • content filtering likely ineffective
• IM spam
  • SIP MESSAGE or message sessions
  • spam intent may not be obvious in first message
    • get attention first with "Hello"
  • short messages harder to analyze with content filters
  • but typically requires white-listing based on presence subscription
• Presence spam (request addition to watcher list)
  • mostly nuisance: user may need to manually deny request
J. Rosenberg, C. Jennings, draft-rosenberg-sipping-spam, July 2004
SIP spam prevention
• All earlier mechanisms apply, with largely the same caveats
• Black lists
  • domain-level
  • within domain, only if domain practices sound user management
• White list
  • may use buddy list as white list
  • stronger user authentication
• Consent-based communication
  • needs to subscribe first
  • but may not be able to recognize address ("is [email protected] a spammer or some long-lost friend?")
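The black-list, white-list and consent steps can be combined into one decision, sketched below. The function name, its arguments and the three outcomes are assumptions for illustration; the slide does not prescribe an order of checks.

```python
def classify_request(sender: str, buddy_list: set, blocked_domains: set,
                     approved_subscription: bool) -> str:
    """Return 'accept', 'reject' or 'ask-user' for an incoming call or IM."""
    domain = sender.rsplit("@", 1)[-1].lower()
    if domain in blocked_domains:       # domain-level black list
        return "reject"
    if sender.lower() in buddy_list:    # buddy list doubles as white list
        return "accept"
    if approved_subscription:           # consent-based: sender subscribed first
        return "accept"
    return "ask-user"                   # unknown: spammer or long-lost friend?


buddies = {"alice@example.com"}
blocked = {"bulk.example"}
print(classify_request("alice@example.com", buddies, blocked, False))
print(classify_request("offers@bulk.example", buddies, blocked, False))
print(classify_request("stranger@example.org", buddies, blocked, False))
```

The last case is the hard one: no list can tell whether an unknown address belongs to a spammer, so the decision falls back to the user.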
SIP spam prevention
• Use of MARID-like DNS domain verification possible
  • may not be needed, due to usage of TLS for interdomain communications
  • but doesn't preclude rogue sub-domains
    • e.g., "is hgs10.columbia.edu allowed to route SIP calls for columbia.edu?"
• transitive trust principle:
  • trust that previous hop applied identity management principles
• longer term, use S/MIME certificates for user-level authentication, but doesn't improve spam prevention much
  • not widely available now
  • if S/MIME certificates are cheap, spammers can mint new identities
SIP authentication
[Figure: SIP trapezoid. Labels: outbound proxy; registrar ([email protected]: 128.59.16.1); destination proxy (identified by SIP URI domain); Digest auth over TLS; TLS mutual host verification; insert crypto-signed identity assertion (AIB, sip-identity); voice traffic (S)RTP]
From domain to user policies
• Not all domains can be classified as "good" or "bad" as a whole
• Many different domain types:
  • Employer
  • ISP
  • Associations (IEEE, ACM, ATIS, …)
  • Personal domains
  • Mailbox providers
• Divide domains by their user policy:
  • Admission-controlled domains
    • most employers
  • Bonded domains
  • Membership domains
    • e.g., credit card
  • Open, rate-limited domains
  • Open domains
Kumar Srivastava, Henning Schulzrinne, "Preventing Spam for SIP-based Instant Messaging and Sessions", Columbia University Technical Report, September 2004.
Reputation and domain descriptions
• Need to define mechanism to obtain domain user verification policy
• Individual user reputation:
  • deposit positive or negative feedback information based on calls
  • depends on cooperation of domain
  • limit user feedback rate to avoid ballot-stuffing
• Fortunately, there seem to be few part-time spammers
Using social networks for spam control
[Figure: social graph whose "is a friend of" edges carry two labels, e.g. strength of knowledge = 0.3 and trust in good behavior = 0.5]
total trust = ∑ (strength * trust)
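The formula can be evaluated directly. What the sum runs over is not stated on the slide; the sketch assumes one (strength, trust) pair per acquaintance who knows the sender, and the second pair in the example is invented for illustration.

```python
def total_trust(links):
    """Sum strength * trust over (strength, trust) pairs."""
    return sum(strength * trust for strength, trust in links)


# First pair uses the values from the figure; the second is hypothetical.
links = [(0.3, 0.5), (0.8, 0.25)]
print(total_trust(links[:1]))
print(total_trust(links))
```

With only the edge shown in the figure the result is 0.3 * 0.5 = 0.15; each additional acquaintance who vouches for the sender raises the total.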
Privacy: Context
• context = "the interrelated conditions in which something exists or occurs"
• anything known about the participants in the (potential) communication relationship
• both at caller and callee
Context types and where they are used:
• time: CPL
• capabilities: caller preferences
• location: location-based call routing, location events
• activity/availability: presence
• sensor data (mood, bio): not yet, but similar in many aspects to location data
Architectures for (geo) information access
• Claim: all using protocols fall into one of these categories
• Presence or event notification
  • "circuit-switched" model
  • subscription: binary decision
• Messaging
  • email, SMS
  • basically, event notification without (explicit) subscription
  • but often out-of-band subscription (mailing list)
• Request-response
  • RPC, HTTP; also DNS, LDAP
  • typically, already has session-level access control (if any at all)
• Presence is superset of other two
• GEOPRIV: IETF working group looking generically at location services (privacy)
• SIMPLE and SIP: event notification, presence
GEOPRIV and SIMPLE architectures
[Figure: corresponding roles in three architectures. GEOPRIV: target, location server, location recipient and rule maker, connected by publication, notification and rule interfaces. SIP presence: presentity, presence agent and watcher, using PUBLISH, NOTIFY and SUBSCRIBE. SIP call: caller and callee, using INVITE.]
GEOPRIV and SIMPLE Policy rules
• There is no sharp geospatial boundary
• Discussed in both GEOPRIV (geospatial) and SIMPLE (SIP IM)
• Presence contains other sensitive data (activity, icons, …) and others may be added
• Example: future extensions to personal medical data
  • "only my cardiologist may see heart rate, but notify everybody in building if heart rate = 0"
• Thus, generic policies are necessary
Presence/Event notification
• Three places for policy enforcement
  • subscription
    • binary
    • only policy, no geo information
    • subscriber may provide filter; could reject based on filter ("sorry, you only get county-level information")
    • greatly improves scaling since no event-level checks needed
  • notification
    • content filtering, suppression
    • only policy, no geo information
  • third-party notification
    • e.g., event aggregator
    • can convert models: gateway subscribes to event source, distributes by email
    • both policy and geo data
Presence policy
[Figure: processing between SUBSCRIBE and NOTIFY for each subscriber (watcher): subscription policy, event generator policy, subscriber filter, rate limiter, "change to previous notification?"; XML rules managed via XCAP]
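The stages in the figure can be sketched as a per-watcher pipeline. The `Watcher` fields, the dictionary event format and the order of the last two checks are assumptions made for illustration, not part of any specification.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Watcher:
    uri: str
    allowed: bool                      # outcome of the subscription policy
    transform: Callable[[dict], dict]  # event generator policy for this watcher
    wanted_keys: set                   # subscriber-supplied filter
    min_interval: float                # rate limit, in seconds
    last_sent_at: float = float("-inf")
    last_body: Optional[dict] = None


def notify(watcher: Watcher, event: dict, now: float) -> Optional[dict]:
    """Return the NOTIFY body for this watcher, or None if nothing is sent."""
    if not watcher.allowed:                             # subscription policy
        return None
    body = watcher.transform(event)                     # event generator policy
    body = {k: v for k, v in body.items() if k in watcher.wanted_keys}
    if now - watcher.last_sent_at < watcher.min_interval:  # rate limiter
        return None
    if body == watcher.last_body:                       # change to previous?
        return None
    watcher.last_sent_at, watcher.last_body = now, body
    return body


w = Watcher("sip:watcher@example.com", True,
            lambda e: {**e, "location": "county-level"},
            {"location", "activity"}, 10.0)
print(notify(w, {"location": "room 7", "activity": "busy", "mood": "ok"}, 0.0))
```

Because the policy reduces the location before the change check, a watcher limited to county-level data is not notified when the presentity merely moves between rooms.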
Policy relationships
[Figure: relationship between policy documents; labels: common policy, geopriv-specific, presence-specific, RPID, CIPID, future]
PIDF-LO (location object)
• Basic location object
  • civic and geospatial
  • typically, in conjunction with presence
• contains source and authority
• basic privacy rules:
  • retention period
  • redistribution allowed
<?xml version="1.0" encoding="UTF-8"?>
<presence xmlns="urn:ietf:params:xml:ns:pidf"
    xmlns:gp="urn:ietf:params:xml:ns:pidf:geopriv10"
    xmlns:gml="urn:opengis:specification:gml:schema-xsd:feature:v3.0"
    entity="pres:[email protected]">
  <tuple id="sg89ae">
    <status>
      <gp:geopriv>
        <gp:location-info>
          <gml:location>
            <gml:Point gml:id="point1" srsName="epsg:4326">
              <gml:coordinates>37:46:30N 122:25:10W</gml:coordinates>
            </gml:Point>
          </gml:location>
        </gp:location-info>
        <gp:usage-rules>
          <gp:retransmission-allowed>no</gp:retransmission-allowed>
          <gp:retention-expiry>2003-06-23T04:57:29Z</gp:retention-expiry>
        </gp:usage-rules>
      </gp:geopriv>
    </status>
    <timestamp>2003-06-22T20:57:29Z</timestamp>
  </tuple>
</presence>
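A recipient of such an object is expected to honor the two basic privacy rules. The sketch below reads them with Python's standard XML parser from a shortened copy of the document; the entity value `pres:user@example.com`, the tuple id and the helper names are placeholders, not taken from the slide.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"gp": "urn:ietf:params:xml:ns:pidf:geopriv10"}

SAMPLE = """<presence xmlns="urn:ietf:params:xml:ns:pidf"
    xmlns:gp="urn:ietf:params:xml:ns:pidf:geopriv10"
    entity="pres:user@example.com">
  <tuple id="t1"><status><gp:geopriv>
    <gp:usage-rules>
      <gp:retransmission-allowed>no</gp:retransmission-allowed>
      <gp:retention-expiry>2003-06-23T04:57:29Z</gp:retention-expiry>
    </gp:usage-rules>
  </gp:geopriv></status></tuple>
</presence>"""


def usage_rules(pidf_xml: str):
    """Return (retransmission_allowed, retention_expiry) from a PIDF-LO."""
    root = ET.fromstring(pidf_xml)
    rules = root.find(".//gp:usage-rules", NS)
    allowed = rules.findtext("gp:retransmission-allowed", namespaces=NS) == "yes"
    expiry_text = rules.findtext("gp:retention-expiry", namespaces=NS)
    expiry = datetime.strptime(expiry_text, "%Y-%m-%dT%H:%M:%SZ")
    return allowed, expiry.replace(tzinfo=timezone.utc)


def may_keep(pidf_xml: str, now: datetime) -> bool:
    """True while the retention period has not yet expired."""
    _, expiry = usage_rules(pidf_xml)
    return now < expiry


print(usage_rules(SAMPLE))
```

The rules travel inside the location object itself, so they only protect the target if every recipient chooses to check them.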
Privacy rule sets
• Conditions such as…
  • identity of requestor
  • time-of-day
  • sphere
• Actions
  • e.g., allow subscription
• Transformation
  • e.g., reduce accuracy of geo data
<rule id="f3g44r1">
  <conditions>
    <identity>
      <uri>[email protected]</uri>
    </identity>
    <validity>
      <from>2003-12-24T17:00:00+01:00</from>
      <to>2003-12-24T19:00:00+01:00</to>
    </validity>
  </conditions>
  <actions></actions>
</rule>
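Checking the conditions of such a rule amounts to an identity match plus a time-window test. The sketch below works on a copy of the rule with a placeholder URI (`user@example.com`); the function name is hypothetical, and actions and transformations are not evaluated.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

RULE = """<rule id="r1">
  <conditions>
    <identity><uri>user@example.com</uri></identity>
    <validity>
      <from>2003-12-24T17:00:00+01:00</from>
      <to>2003-12-24T19:00:00+01:00</to>
    </validity>
  </conditions>
  <actions></actions>
</rule>"""


def rule_matches(rule_xml: str, requestor: str, when: datetime) -> bool:
    """True if the requestor is listed and `when` lies in the validity window.

    `when` must be timezone-aware so it can be compared with the rule's times.
    """
    rule = ET.fromstring(rule_xml)
    uris = [u.text for u in rule.findall("./conditions/identity/uri")]
    if requestor not in uris:
        return False
    start = datetime.fromisoformat(rule.findtext("./conditions/validity/from"))
    end = datetime.fromisoformat(rule.findtext("./conditions/validity/to"))
    return start <= when <= end


inside = datetime.fromisoformat("2003-12-24T18:00:00+01:00")
print(rule_matches(RULE, "user@example.com", inside))
```

A rule that does not match grants nothing, so a requestor outside the window or not listed in the identity condition is simply not covered by it.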
Conclusion
• Protocol and technical means as a complement to legal actions
• Identity-based techniques more promising than content-based approaches
• New applications (VoIP, IM, presence) vulnerable to unsolicited communications
  • with possibly larger impact due to lower cost and lower legal barriers
  • content-based techniques fail altogether
• New applications do not lend themselves to current content-based spam prevention techniques
• Domain-based rather than person-based mechanisms appear promising
• Need policy languages for sharing private data