Page 1:

Personalization and Privacy

Alfred Kobsa

School of Information and Computer Science
University of California, Irvine

Page 2:

Personalization is widespread on today’s World Wide Web

Percent of 44 companies interviewed (multiple responses accepted):

• Tailored email alerts: 64%
• Customized content: 48%
• Account access: 48%
• Personal productivity tools: 23%
• Wish lists: 23%
• Product recommendations: 23%
• Saved links: 20%
• Express transactions: 16%
• Targeted marketing/advertising: 11%
• Custom pricing: 9%
• Personalized content through non-PC devices: 7%
• News clipping services: 5%

Source: Forrester Research

Page 3:

Personalization in the future

• Web courses that tailor their teaching strategy to each individual student

• Information and recommendations by portable devices that consider users’ location and habits

• Product descriptions whose complexity is geared towards the presumed level of user expertise

• Tailored presentations that take into account the user’s preferences regarding product presentation and media types (text, graphics, video)

• Recommendations that are based on recognized interests and goals of the user

Page 4:

Current personalization methods (in 30 seconds)

Data sources
• Explicit user input
• User interaction logs

Methods
• Assignment to user groups
• Rule-based inferences
• Machine learning
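The first two methods can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the age threshold, the group names, the URL layout of the log, and the `min_visits` rule. Machine learning is left out.

```python
from collections import Counter


def assign_group(explicit_input):
    """Assignment to a user group, based on explicit user input."""
    age = explicit_input.get("age")
    if age is None:
        return "unknown"
    # Illustrative threshold, not taken from the slides.
    return "under_30" if age < 30 else "30_and_over"


def infer_interests(interaction_log, min_visits=2):
    """Rule-based inference: a category visited at least `min_visits`
    times is assumed to be an interest of the user."""
    # Paths are assumed to look like "/<category>/<item>".
    visits = Counter(path.split("/")[1] for path in interaction_log)
    return sorted(cat for cat, n in visits.items() if n >= min_visits)


# The two data sources named on the slide.
explicit_input = {"age": 34}
interaction_log = ["/travel/rome", "/travel/oslo", "/cooking/pasta"]

group = assign_group(explicit_input)
interests = infer_interests(interaction_log)
```

Here `group` is derived from explicit input alone and `interests` from the interaction log alone, so each method can be used without the other data source.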

Page 5:

Web personalization delivers benefits for both users and web vendors

Jupiter Communications, 1998: Personalization at 25 consumer e-commerce sites increased the number of new customers by 47% in the first year, and revenues by 52%.

Nielsen NetRatings, 1999:
• Registered visitors to portal sites spend over 3 times longer at their home portal than other users, and view 3 to 4 times more pages at their portal
• E-commerce sites offering personalized services convert significantly more visitors into buyers than those that don’t

Gartner Group, 1999: By 2003, nearly 85 percent of global 1,000 Web sites will use some form of personalization (0.7 probability)

Jupiter Research, 2003:
• Personalized sites cost four times more than regular sites
• Few users indicate that personalization would make them buy more often

Downside: Personalized sites collect significantly more personal data than regular websites, and often do so in a very inconspicuous manner.

Page 6:

Privacy and Personalization

[Figure: diagram relating the factors listed below; its links are labeled "+" and "++"]

• User trust
• Willingness to disclose personal data
• Quality of personalization
• User benefits (short-term)
• Privacy concerns
• Length and frequency of interaction
• (Verified) privacy agreements
• (Enforced) privacy laws
• (Guaranteed) anonymity
• User control
• Understanding

Page 7:

Myinfosource: An online subscription place for movies, ebooks and news articles

Knows the user
• Knows what the user downloaded in the past few years, and for how long the user viewed it
• Made assumptions about the user’s interests and expertise
• Knows many additional facts that the user provided about themselves

Provides personalized services
• Makes recommendations for books and movies based on what the user downloaded in the past
• Alerts the user by email to new releases or current world events that are presumably of interest to them
• Discourages the user from downloading material that is not suitable for them

Is designed for privacy
• Myinfosource will never know the identity of the user
• No one will ever know that the user visited Myinfosource
• Communication with Myinfosource is secure

(Optional: backdoor for law enforcement purposes)

Page 8:

Privacy through Pseudonymity in User-Adaptive Systems

Guarantee privacy in user-adaptive systems through pseudonymity whilst fully preserving personalized interaction.

Desired properties (ISO 15408-2; Pfitzmann & Köhntopp [Hansen] 2001)

• [latently] unidentifiable: neither the personalized system nor third parties can determine the identity of pseudonymous users

• linkable for the user-adaptive system: the personalized system can link every interaction step of a user, even across sessions (users maintain a persistent identity)

• unlinkable for third parties: third parties cannot link two interaction steps of the same user

• unobservable for third parties: the usage of a personalized application by a user should not be recognizable by third parties
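The first two properties can be sketched in a few lines, under stated assumptions: the class `PseudonymousUserModel` and the helper `new_pseudonym` are hypothetical names, and the pseudonym is simply a random token chosen on the user's side. Unlinkability and unobservability for third parties depend on anonymizing the network traffic and are not covered by this sketch.

```python
import secrets


def new_pseudonym():
    """Random token; it carries no information about the holder's identity."""
    return secrets.token_hex(16)


class PseudonymousUserModel:
    """User models keyed by pseudonym only; no identifying field is stored."""

    def __init__(self):
        self._models = {}

    def record(self, pseudonym, session_id, event):
        # Every interaction step is appended under the same pseudonym,
        # which makes steps linkable for the system, even across sessions.
        self._models.setdefault(pseudonym, []).append((session_id, event))

    def history(self, pseudonym):
        return list(self._models.get(pseudonym, []))


store = PseudonymousUserModel()
pseudonym = new_pseudonym()
store.record(pseudonym, "session-1", "viewed movie 42")
store.record(pseudonym, "session-2", "downloaded ebook 7")
```

Because the user presents the same pseudonym in both sessions, `store.history(pseudonym)` returns both steps, while the store itself holds nothing that names the user.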

Page 9:

An Architecture for Privacy through Pseudonymity in User-Adaptive Systems

• Secure transport
• Anonymization of the traffic between
  – user <--> application
  – application <--> user model (“wallet”)
• Role-based control for access to user model

Kobsa and Schreck (2003), ACM Tr. on Internet Technology

Page 10:

Will anonymous interaction be deployed?

– Few readily-available distributed anonymization infrastructures (such as mixes) have as yet been put in place

– Anonymous interaction is currently difficult to maintain when payments, physical goods and non-electronic services are being exchanged

– Anonymity on the Internet may harbor the risk of misuse and currently even seems to have an air of disreputability

– Web retailers also have a considerable interest in identified customer data as a business asset (side income, customer segmentation, cross-channel CRM)

+ Regulatory provisions that mandate anonymous and pseudonymous access to electronic services (Germany, European Commission)

+ Articulated consumer demand which gives businesses that offer personalized anonymous interaction a competitive advantage which outweighs its commercial downsides

Page 11:

Addressing privacy constraints of non-anonymous interaction

If the interaction with the user is carried out in a non-anonymous manner, then the system needs to cater to
• the privacy preferences of the user (variable within a user session)
• the laws and privacy policies that apply to the system provider (largely constant)
• the legislation in the jurisdiction of the user (variable per user)

A flexible architecture is needed that allows a system to use the personalization method that
• meets the current privacy constraints, and
• is the best method for the current personalization purposes (preference order)
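One way to picture the three sources of constraints is as sets of permitted processing operations whose intersection gives the currently effective permissions. This is a sketch under that assumption only; the operation names, the jurisdictions "X" and "Y", and their rules are invented for illustration.

```python
# Largely constant: what the provider's own laws and policies allow.
PROVIDER_POLICY = frozenset({"session_log", "cross_session_log", "demographics"})

# Variable per user: what the user's jurisdiction allows (hypothetical rules).
JURISDICTION_RULES = {
    "X": frozenset({"session_log", "demographics"}),
    "Y": frozenset({"session_log", "cross_session_log", "demographics"}),
}


def effective_permissions(provider, jurisdiction, user_prefs):
    """An operation is permitted only if all three sources permit it."""
    return provider & jurisdiction & frozenset(user_prefs)


# Variable within a session: the user's own preferences.
user_prefs = {"session_log", "cross_session_log"}
before = effective_permissions(PROVIDER_POLICY, JURISDICTION_RULES["X"], user_prefs)

# The user withdraws a permission in mid-session.
user_prefs.discard("session_log")
after = effective_permissions(PROVIDER_POLICY, JURISDICTION_RULES["X"], user_prefs)
```

Recomputing the intersection whenever the user's preferences change reflects that these preferences are variable within a session, whereas the other two inputs stay fixed for that user.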

Page 12:

Provisions in privacy laws that have impacts on traditional user modeling methods

• Usage logs must be deleted after each session
• Usage logs of different services may not be combined
• User profiles are permissible only if pseudonyms are used. (Any user profile covered by a pseudonym may not be combined with data concerning the holder of the pseudonym.)
• No fully automated individual decisions are allowed that produce legal effects concerning the data subject or significantly affect him and which are based solely on automated processing of data intended to evaluate certain personal aspects relating to him, such as his performance at work, creditworthiness, reliability, conduct, etc.
• Anonymous or pseudonymous access and payment must be offered if technically possible and reasonable.

Kobsa (2001), Proc. UM01, Springer

Page 13:

Redundant Component Array (RAIC; Liu & Richardson 2002)

[Figure: a Controller and redundant components, drawn in two rows: Component A, Component B; Component X, Component Y, Component Z]

Page 14:

RAIC-based architecture for personalized systems

• Personalized systems need to be based on RAICs whose components are functionally inclusionary or at least similar
• At any given point, the services of the RAIC are delivered by the component which
  – meets the current privacy constraints, AND
  – is ranked highest in a (static) activation preference order

Kobsa (2003), Proc. PET03, Springer
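The selection rule can be sketched as a small controller. This is an illustration under stated assumptions, not the architecture from the cited paper: `Component`, `Controller`, the permission names, and the non-personalized fallback used when no component qualifies are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Component:
    name: str
    requires: FrozenSet[str]      # privacy permissions the component needs
    serve: Callable[[str], str]   # the service the component delivers


class Controller:
    """Routes each request to one of several redundant components."""

    def __init__(self, by_preference: List[Component]):
        # Static activation preference order, most preferred first.
        self.by_preference = by_preference

    def active(self, permitted: FrozenSet[str]) -> Optional[Component]:
        # The first match is the highest-ranked component that meets
        # the current privacy constraints.
        for component in self.by_preference:
            if component.requires <= permitted:
                return component
        return None

    def serve(self, request: str, permitted: FrozenSet[str]) -> str:
        component = self.active(permitted)
        if component is None:
            return f"generic:{request}"  # assumed fallback, not from the slides
        return component.serve(request)


rich = Component("rich", frozenset({"session_log", "profile"}), lambda r: f"rich:{r}")
light = Component("light", frozenset({"session_log"}), lambda r: f"light:{r}")
raic = Controller([rich, light])
```

With both permissions granted, `raic.serve("home", frozenset({"session_log", "profile"}))` is handled by `rich`; if `"profile"` is withdrawn, the same call falls through to `light` without the caller changing anything.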

Page 15:

Example: web store gives personalized purchase recommendations to visitors

RAIC-based architecture, with the following components for making recommendations about things to buy based on different data:

A: user’s demographic data (age, gender, profession, ZIP), by drawing conclusions based on market segmentation data

B: user’s page visits (during the current session only), using “quick” one-time machine learning methods

C: user’s demographic data and page visits (in the current session only), using a combination of the methods in A and B

D: user’s page visits during several sessions, using “slow” learning methods which store the user log between sessions

E: user’s demographic data and her page visits during several sessions, using a combination of the methods in A and D (the user trace is again stored between sessions).

Page 16:

Prerequisites for the operation of components A-E

• Availability of data (A-E)
• User consent required by privacy laws
  – A-E: consent to the processing of personal data
  – C, E: consent to profiling
  – D, E: consent to preservation of use data beyond session
• Self-regulatory privacy principles
  – C, E: consent required by NAI
• Individual privacy preferences
  – B-E: consent to “being watched”
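The consent prerequisites can be written down as a table from component to required consents. The consent labels below are shorthand invented for this sketch, the preference order E, D, C, B, A is an assumption (the slides do not state one), and availability of data is not modeled.

```python
# Consents each component needs, transcribed from the prerequisites above.
REQUIRED_CONSENTS = {
    "A": {"processing"},
    "B": {"processing", "watching"},
    "C": {"processing", "profiling", "nai", "watching"},
    "D": {"processing", "retention", "watching"},
    "E": {"processing", "profiling", "retention", "nai", "watching"},
}

# Assumed preference order: components using more data are preferred.
PREFERENCE_ORDER = ["E", "D", "C", "B", "A"]


def eligible(consents):
    """Components whose consent prerequisites are all met, best first."""
    given = set(consents)
    return [c for c in PREFERENCE_ORDER if REQUIRED_CONSENTS[c] <= given]


def select(consents):
    """Highest-ranked eligible component, or None if none qualifies."""
    candidates = eligible(consents)
    return candidates[0] if candidates else None
```

For example, a visitor who consents to processing and to “being watched” but to nothing else gets recommendations from B; adding consent to retention moves them to D, and withdrawing all consent leaves no component eligible.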

Page 17:

Privacy and Personalization

[Figure: the diagram from page 6, relating the factors listed below; its links are now labeled "+", "++", "?", "0/-" and "0"]

• User trust
• Willingness to disclose personal data
• Quality of personalization
• User benefits
• Privacy concerns
• Length and frequency of interaction
• (Verified) privacy agreements
• (Enforced) privacy laws
• (Guaranteed) anonymity
• User control
• Understanding

Page 18:

Current specification of privacy preferences

Currently, users have to specify their privacy preferences
• upfront, without context
• without explanation of what data exactly will be requested, and for what purpose
• without explanation of what privacy impacts they should be concerned about
• without explanation of what they miss out on if they do not provide the requested data

Inconsistencies between stated privacy preferences when prompted upfront and out-of-context, and actual usage behavior in a concrete context.

Page 19:

Increasing users’ control and understanding of privacy

How may the site use personal data?

… and what are the resulting personalization benefits?