Personalization and Privacy
Alfred Kobsa
School of Information and Computer Science, University of California, Irvine
Personalization is widespread on today’s World Wide Web

Percent of 44 companies interviewed that offer each feature (multiple responses accepted):
• Tailored email alerts: 64%
• Customized content: 48%
• Account access: 48%
• Personal productivity tools: 23%
• Wish lists: 23%
• Product recommendations: 23%
• Saved links: 20%
• Express transactions: 16%
• Targeted marketing/advertising: 11%
• Custom pricing: 9%
• Personalized content through non-PC devices: 7%
• News clipping services: 5%

Source: Forrester Research
Personalization in the future
• Web courses that tailor their teaching strategy to each individual student
• Information and recommendations by portable devices that consider users’ location and habits
• Product descriptions whose complexity is geared towards the presumed level of user expertise
• Tailored presentations that take into account the user’s preferences regarding product presentation and media types (text, graphics, video)
• Recommendations that are based on recognized interests and goals of the user
Current personalization methods (in 30 seconds)

Data sources
• Explicit user input
• User interaction logs

Methods
• Assignment to user groups
• Rule-based inferences
• Machine learning
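The first two methods can be illustrated with a toy sketch. The group names, attributes, and rules below are illustrative assumptions, not taken from the talk:

```python
# Toy sketch of two classic personalization methods: assignment of a
# user to a predefined group (stereotype) based on explicit input, and
# rule-based inference over an interaction log.

def assign_group(user: dict) -> str:
    """Assign the user to a coarse group from explicitly provided data."""
    if user.get("age", 0) < 30 and "movies" in user.get("interests", []):
        return "young-movie-fan"   # hypothetical stereotype
    return "general"

def infer_interests(interaction_log: list) -> set:
    """Rule-based inference over logged page visits."""
    inferred = set()
    # Hypothetical rule: 3+ visits to sci-fi pages imply a sci-fi interest.
    if sum(1 for e in interaction_log if e["page"].startswith("/scifi/")) >= 3:
        inferred.add("science-fiction")
    return inferred

user = {"age": 24, "interests": ["movies"]}
log = [{"page": "/scifi/dune"}, {"page": "/scifi/solaris"}, {"page": "/scifi/arrival"}]
print(assign_group(user))      # young-movie-fan
print(infer_interests(log))    # {'science-fiction'}
```

Machine-learning methods generalize the second function: instead of hand-written rules, a model is trained on many users' logs.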
Web personalization delivers benefits for both users and web vendors

Jupiter Communications, 1998: Personalization at 25 consumer e-commerce sites increased the number of new customers by 47% in the first year, and revenues by 52%.

Nielsen NetRatings, 1999:
• Registered visitors to portal sites spend over 3 times longer at their home portal than other users, and view 3 to 4 times more pages at their portal
• E-commerce sites offering personalized services convert significantly more visitors into buyers than those that don’t.

Gartner Group, 1999: By 2003, nearly 85 percent of Global 1,000 web sites will use some form of personalization (0.7 probability).

Jupiter Research, 2003:
• Personalized sites cost four times more than regular sites
• Few users indicate that personalization would make them buy more often

Downside: Personalized sites collect significantly more personal data than regular websites, and often do so in a very inconspicuous manner.
Privacy and Personalization

[Diagram: causal model. Privacy concerns reduce (–) the willingness to disclose personal data, while user trust, short-term user benefits, and the length and frequency of interaction increase it (+). Willingness to disclose drives the quality of personalization. (Verified) privacy agreements, (enforced) privacy laws, and (guaranteed) anonymity strengthen user trust (+), as do user control and understanding (++).]
Myinfosource: an online subscription site for movies, ebooks and news articles

Knows the user
• Knows what the user downloaded in the past few years, and for how long she viewed it
• Made assumptions about the user’s interests and expertise
• Knows many additional facts which the user provided about herself

Provides personalized services
• Makes recommendations for books and movies based on what she downloaded in the past
• Alerts the user by email to new releases or current world events that are presumably interesting for her
• Discourages the user from downloading material that is not suitable for her

Is designed for privacy
• Myinfosource will never know the identity of the user
• No one will ever know that the user ever visited Myinfosource
• Communication with Myinfosource is secure
• (Optional: backdoor for law enforcement purposes)
Privacy through Pseudonymity in User-Adaptive Systems

Goal: guarantee privacy in user-adaptive systems through pseudonymity while fully preserving personalized interaction.

Desired properties (ISO/IEC 15408-2; Pfitzmann & Köhntopp [Hansen] 2001)
• [Latently] unidentifiable: neither the personalized system nor third parties can determine the identity of pseudonymous users
• Linkable for the user-adaptive system: the personalized system can link every interaction step of a user, even across sessions (users maintain a persistent identity)
• Unlinkable for third parties: third parties cannot link two interaction steps of the same user
• Unobservable for third parties: the usage of a personalized application by a user should not be recognizable by third parties

Means:
• Secure transport
• Anonymization of the traffic between
  – user <--> application
  – application <--> user model (“wallet”)
• Role-based control for access to the user model

Kobsa and Schreck (2003), ACM Transactions on Internet Technology
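The linkability/unlinkability properties can be sketched in a few lines. This is an illustrative toy, not the Kobsa & Schreck architecture: the user derives a persistent pseudonym from a secret only she holds, so the user model can link all her sessions, while neither the system nor a third party can invert the hash to recover her identity:

```python
# Minimal sketch of pseudonymous but linkable interaction (assumed
# scheme for illustration): a per-service pseudonym derived by a
# one-way hash from a user-held secret.
import hashlib

def pseudonym(user_secret: str, service: str) -> str:
    # Per-service pseudonym: linkable across sessions within one
    # service, unlinkable across different services.
    return hashlib.sha256(f"{user_secret}:{service}".encode()).hexdigest()[:16]

user_model: dict = {}  # keyed by pseudonym, never by real identity

def record_interaction(pnym: str, event: str) -> None:
    user_model.setdefault(pnym, []).append(event)

secret = "alice-private-secret"          # never leaves the user's device
p1 = pseudonym(secret, "myinfosource")
record_interaction(p1, "downloaded ebook")

# A later session with the same secret yields the same pseudonym,
# so the user model can link both sessions:
assert pseudonym(secret, "myinfosource") == p1
# A different service sees an unlinkable pseudonym:
assert pseudonym(secret, "othershop") != p1
```

Real deployments additionally need anonymized transport (e.g. mixes), since a network observer could otherwise link the pseudonym to the user's address.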
An Architecture for Privacy through Pseudonymity in User-Adaptive Systems
Will anonymous interaction be deployed?
– Few readily-available distributed anonymization infrastructures (such as mixes) have as yet been put in place
– Anonymous interaction is currently difficult to maintain when payments, physical goods and non-electronic services are being exchanged
– Anonymity on the Internet may harbor the risk of misuse and currently even seems to have an air of disreputability
– Web retailers also have a considerable interest in identified customer data as a business asset (side income, customer segmentation, cross-channel CRM)
+ Regulatory provisions that mandate anonymous and pseudonymous access to electronic services (Germany, European Commission)
+ Articulated consumer demand which gives businesses that offer personalized anonymous interaction a competitive advantage which outweighs its commercial downsides
Addressing privacy constraints of non-anonymous interaction

If the interaction with the user is carried out in a non-anonymous manner, then the system needs to cater to
• the privacy preferences of the user (variable within a user session)
• the laws and privacy policies that apply to the system provider (largely constant)
• the legislation in the jurisdiction of the user (variable per user)

A flexible architecture is needed that allows a system to use the personalization method that
• meets the current privacy constraints, and
• is the best method for the current personalization purposes (→ preference order)
Provisions in privacy laws that have impacts on traditional user modeling methods

• Usage logs must be deleted after each session
• Usage logs of different services may not be combined
• User profiles are permissible only if pseudonyms are used. (Any user profile covered by a pseudonym may not be combined with data concerning the holder of the pseudonym.)
• No fully automated individual decisions are allowed that produce legal effects concerning the data subject or significantly affect him, and which are based solely on automated processing of data intended to evaluate certain personal aspects relating to him, such as his performance at work, creditworthiness, reliability, conduct, etc.
• Anonymous or pseudonymous access and payment must be offered if technically possible and reasonable.

Kobsa (2001), Proc. UM01, Springer
Redundant Component Array (RAIC; Liu & Richardson 2002)

[Diagram: a controller in front of an array of redundant components (Component A, Component B, …, Component X, Component Y, Component Z)]

RAIC-based architecture for personalized systems
• Personalized systems need to be based on RAICs whose components are functionally inclusionary or at least similar
• At any given point, the services of the RAIC are delivered by that component which
  – meets the current privacy constraints, AND
  – is ranked highest in a (static) activation preference order
Kobsa (2003), Proc. PET03, Springer
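The controller's selection rule can be sketched as follows. This is a sketch with assumed interfaces, not the PET03 implementation; the requirement labels are illustrative:

```python
# Sketch of a RAIC-style controller: components sit in a static
# activation preference order (best first), each declaring the privacy
# permissions it requires. The controller activates the highest-ranked
# component whose requirements are all currently permitted.

COMPONENTS = [
    {"name": "E", "requires": {"demographics", "profiling", "cross-session-logs"}},
    {"name": "D", "requires": {"cross-session-logs"}},
    {"name": "C", "requires": {"demographics", "profiling"}},
    {"name": "B", "requires": {"session-logs"}},
    {"name": "A", "requires": {"demographics"}},
]

def select_component(permitted: set):
    """Return the highest-ranked component whose requirements are met."""
    for comp in COMPONENTS:
        if comp["requires"] <= permitted:   # subset test
            return comp["name"]
    return None  # fall back to non-personalized operation

# A user who consented to session logging only:
print(select_component({"session-logs"}))                             # B
# A user who also allowed demographics and profiling:
print(select_component({"session-logs", "demographics", "profiling"}))  # C
```

Because the preference order is static, only the `permitted` set changes at runtime, so the selection can be re-evaluated whenever the user adjusts her privacy preferences mid-session.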
Example: web store gives personalized purchase recommendations to visitors
RAIC-based architecture, with the following components for making recommendations about things to buy based on different data:
A: user’s demographic data (age, gender, profession, ZIP), by drawing conclusions based on market segmentation data
B: user’s page visits (during the current session only), using “quick” one-time machine learning methods;
C: user’s demographic data and page visits (in the current session only), using a combination of the methods in A and B
D: user’s page visits during several sessions, using “slow” learning methods which store the user log between sessions
E: user’s demographic data and her page visits during several sessions, using a combination of the methods in A and D (the user trace is again stored between sessions).
Prerequisites for the operation of components A-E

• Availability of data (A-E)
• User consent required by privacy laws
  – A-E: consent to the processing of personal data
  – C, E: consent to profiling
  – D, E: consent to preservation of usage data beyond the session
• Self-regulatory privacy principles
  – C, E: consent required by NAI
• Individual privacy preferences
  – B-E: consent to “being watched”
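The prerequisite table above can be checked mechanically. The component-to-consent mapping follows the slide; the data structures and label names are assumptions for illustration:

```python
# Which of components A-E may operate, given the consents a user has
# granted? Each component lists all of its prerequisites; a component
# is operable only if every prerequisite is satisfied.

PREREQS = {
    "A": {"data", "law-consent"},
    "B": {"data", "law-consent", "watch-consent"},
    "C": {"data", "law-consent", "profiling-consent", "nai-consent",
          "watch-consent"},
    "D": {"data", "law-consent", "retention-consent", "watch-consent"},
    "E": {"data", "law-consent", "profiling-consent", "retention-consent",
          "nai-consent", "watch-consent"},
}

def operable(granted: set) -> list:
    """Components whose prerequisites are all satisfied."""
    return sorted(c for c, req in PREREQS.items() if req <= granted)

# User provided her data, gave the legally required consent, and
# accepts being watched during the session, but nothing more:
print(operable({"data", "law-consent", "watch-consent"}))   # ['A', 'B']
```

The RAIC controller would then activate the highest-ranked member of this operable set.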
Privacy and Personalization

[Diagram: the causal model from before, revisited. Privacy concerns reduce (–) the willingness to disclose personal data; user trust, user benefits, and the length and frequency of interaction increase it (+); willingness to disclose drives the quality of personalization. The effects of (verified) privacy agreements, (enforced) privacy laws, and (guaranteed) anonymity on trust are now marked with question marks and 0/– annotations, while user control and understanding remain strongly positive (++).]
Current specification of privacy preferences

Currently, users have to specify their privacy preferences
• upfront, without context
• without explanation of what data exactly will be requested, and for what purpose
• without explanation of what privacy impacts they should be concerned about
• without explanation of what they miss out on if they do not provide the requested data

Result: inconsistencies between stated privacy preferences when prompted upfront and out of context, and actual usage behavior in a concrete context.
Increasing users’ control and understanding of privacy
How may the site use personal data?
… and what are the resulting personalization benefits?