On serendipity in recommender systems - Haifa RecSoc workshop june 2015

On serendipity in recommender systems

Giovanni Semeraro

University of Bari Aldo Moro, Italy

Advances in Recommender Systems

Social and Semantic Aspects

RecSoc 2015

Haifa, June 16-17, 2015

Semantic Web Access and Personalization research group http://www.di.uniba.it/~swap


Focus: Emotions as implicit feedback for assessing serendipity of recommendations

Acknowledgments

Marco de Gemmis

Pasquale Lops

Cataldo Musto

Marko Tkalcic

Serendipity

Serendip = “Simhala dvipa” (Sanskrit), the old name of the island of Ceylon, now Sri Lanka

Outline

Serendipity and Evaluation

Research questions

Operationally induced serendipity:

Knowledge Infusion (KI) process

Item-to-Item correlation matrix

Random Walk with Restart boosted by KI

Experimental evaluation

Noldus FaceReader™

Dataset

Design of the experiment

Metrics

Questionnaire analysis

Analysis of user emotions

Conclusions

Serendipity in Information Seeking

Information seeking metaphor investigated in the literature (Toms 2000, André et al. 2009, Bordino et al. 2013)

Toms suggests 4 strategies:

Blind luck, or the “role of chance”: random

Pasteur principle, or “chance favors only the prepared mind”: flashes of insight don’t just happen; they are the products of a “prepared mind”

Anomalies and exceptions, or “searching for dissimilarities”: identification of items dissimilar to those the user liked in the past

Reasoning by analogy: an abstraction mechanism allowing the system to discover the applicability of an existing schema to a new situation

(Toms 2000) E. Toms. Serendipitous Information Retrieval. Proc. 1st DELOS NoE Workshop on Information Seeking, Searching and Querying in Digital Libraries, Zurich, Switzerland: ERCIM, 2000.

(André et al. 2009) P. André, J. Teevan, S.T. Dumais. From x-rays to silly putty via Uranus: serendipity and its role in web search. Proc. ACM CHI 2009, ACM, New York, NY, USA, 2009.

(Bordino et al. 2013) I. Bordino, Y. Mejova, M. Lalmas. Penguins in sweaters, or serendipitous entity search on user-generated content. Proc. 22nd ACM CIKM 2013, ACM, New York, NY, USA, 2013, pp. 109–118.

Serendipitous recommendations

“Suggestions which help the user to find surprisingly interesting items she might not have discovered by herself” (Herlocker et al. 2004)

Both attractive and unexpected

“The experience of receiving an unexpected and fortuitous item recommendation” (McNee et al. 2006)

“Serendipity involves a positive emotional response of the user about novel items” (Shani and Gunawardana 2011)

(Herlocker et al. 2004) J.L. Herlocker, J.A. Konstan, L.G. Terveen, J.T. Riedl. Evaluating Collaborative Filtering Recommender Systems. ACM Transactions on Information Systems 22(1): 5–53, 2004.

(McNee et al. 2006) S.M. McNee, J. Riedl, J.A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In CHI ’06 Extended Abstracts on Human Factors in Computing Systems, CHI EA ’06, 1097–1101, ACM, New York, NY, USA, 2006.

(Shani and Gunawardana 2011) G. Shani, A. Gunawardana. Evaluating Recommendation Systems. In F. Ricci, L. Rokach, B. Shapira, P.B. Kantor (Eds.), Recommender Systems Handbook, Springer, 2011, pp. 257–297.

Serendipitous recommendations

A response to the overspecialization problem and the filter bubble (Pariser 2011)

Tendency to provide the user with items within her existing range of interests

Suggesting “STAR TREK” to a science-fiction fan: accurate but obvious, thus actually not useful

Users don’t want algorithms that produce better ratings, but sensible recommendations

(Pariser 2011) E. Pariser. The Filter Bubble: What the Internet Is Hiding from You. Penguin Group, May 2011.

Obviousness in recommendations: homophily

The tendency to surround ourselves with like-minded people [E. Zuckerman. Homophily, serendipity, xenophilia. April 25, 2008. www.ethanzuckerman.com/blog/2008/04/25/homophily-serendipity-xenophilia/]

Opinions taken to extremes, cultural impoverishment

A threat to biodiversity?

Homophily in the digital world

In the physical world, one of the strongest sources of homophily is locality, due to geographic proximity, family ties, and organizational factors (school, work, etc.)

In the digital world, physical locality is less important. Other factors, such as common interests, might play a central role

2 main questions:

– Are two users more likely to be friends if they share common interests?

– Are two users more likely to share common interests if they are friends?

In (Lauw et al. 2010), the answer to both questions is YES

(Lauw et al. 2010) H.W. Lauw, J.C. Schafer, R. Agrawal, A. Ntoulas. Homophily in the Digital World: A LiveJournal Case Study. IEEE Internet Computing 14(2): 15–23, March-April 2010.

The homophily trap

Does homophily hurt RecSys?

try to tell Amazon that you liked the movie “War

Games”…

The homophily trap

Recommendations by other GEEKS!

“Item-to-Item” homophily…

…Harry Potter for ever?

Serendipity & Search Engines

Poll

Is Personalization A Form Of Censorship?

Yes: 73%

No: 23%

Other: 4%

L. Carr and S. Harnad. Offload Cognition onto the Web. IEEE Intelligent Systems 26(1): 33–39, 2011.

Evaluation of Serendipity: research questions

Is the user’s emotional response useful for assessing serendipity?

Can emotions observed in facial expressions be considered trustworthy implicit feedback for assessing the pleasant surprise that serendipity should convey?


Operationally induced Serendipity: A Quick Look at the Recommendation Algorithm

Novel method for computing item similarity

Tries to find “hidden associations” instead of computing attribute similarity

Knowledge-intensive process that allows deeper understanding of item descriptions

Knowledge Infusion (KI)

Provides the RecSys with background knowledge built from external sources

Content-Based (CB) approach that exploits the knowledge base to compute a correlation index between items

Operationally induced Serendipity: Knowledge Infusion (KI)

Which “words”?

Words that induce positive emotions

Relevant/attractive words able to surprise the conversation partner

A form of nudging?

“Language is the Skin of my Thought”

Arundhati Roy. Power Politics. South End Press, January 2001.

“Words” Recommender System

Recommending Words: the Architecture of the KI process

KI@Work

[Figure: architecture of the KI process. Labels: sci-fi, conflicts/fights; CLUE#1 to CLUE#5; Knowledge Source #1 to Knowledge Source #n; BACKGROUND KNOWLEDGE; SPREADING ACTIVATION NETWORK; KEYWORD1, KEYWORD2; NEW KEYWORDS ASSOCIATED WITH CLUES]

G. Semeraro, M. de Gemmis, P. Lops, P. Basile. An Artificial Player for a Language Game. IEEE Intelligent Systems 27(5): 36–43, 2012.

P. Basile, M. de Gemmis, P. Lops, G. Semeraro. Solving a Complex Language Game by using Knowledge-based Word Associations Discovery. IEEE Transactions on Computational Intelligence and AI in Games, 2015 (in press). DOI: 10.1109/TCIAIG.2014.2355859.
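The KI architecture names a spreading activation network that turns clues into new associated keywords. As a rough illustration of that general technique only, here is a minimal sketch. The graph format, the `decay`, `steps` and `threshold` parameters, and the toy association weights are all assumptions made for this example; they are not taken from the KI implementation described in the cited papers.

```python
from collections import defaultdict


def spread_activation(graph, clues, decay=0.5, steps=2, threshold=0.01):
    """Propagate activation from clue words through a weighted word graph.

    graph: dict mapping word -> {neighbour: association weight in [0, 1]}
           (hypothetical format, assumed for this sketch)
    clues: starting words, each given an initial activation of 1.0
    Returns (word, activation) pairs for non-clue words, strongest first.
    """
    clues = list(clues)
    activation = defaultdict(float)
    frontier = {clue: 1.0 for clue in clues}
    for clue in clues:
        activation[clue] = 1.0

    for _ in range(steps):
        next_frontier = defaultdict(float)
        for word, energy in frontier.items():
            for neighbour, weight in graph.get(word, {}).items():
                passed = energy * weight * decay  # energy fades at each hop
                if passed >= threshold:
                    next_frontier[neighbour] += passed
        for word, energy in next_frontier.items():
            activation[word] += energy
        frontier = next_frontier

    clue_set = set(clues)
    ranked = [(w, a) for w, a in activation.items() if w not in clue_set]
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


# Toy association graph with made-up weights, for illustration only.
toy_graph = {
    "sci-fi": {"space": 0.8, "alien": 0.6},
    "fights": {"war": 0.9, "alien": 0.5},
    "space": {"star": 0.7},
}
new_keywords = [word for word, _ in spread_activation(toy_graph, ["sci-fi", "fights"])]
```

Words reachable from several clues accumulate activation from each of them, which is why "alien" ranks first in the toy graph: both clues point to it.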

KI as a novel method for computing associations between items

[Figure labels: clues; BM25 retrieval score]

KI as a Serendipity Engine: Item-to-Item similarity matrix

Item-to-Item correlation matrix

$w_{ij}$ computed in different ways:

– number of users who co-rated items $I_i$ and $I_j$

– cosine similarity between descriptions of items $I_i$ and $I_j$

– Knowledge Infusion correlation index

Recommendation list computed by Random Walk with Restart (Lovasz 1996) augmented with KI (RWR-KI)

(Lovasz 1996) L. Lovasz. Random Walks on Graphs: a Survey. Combinatorics 2:1–46, 1996.
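To make the pipeline on this slide concrete, the following is a minimal sketch of a Random Walk with Restart over an item-to-item weight matrix. It fills the matrix with the cosine similarity between keyword descriptions, one of the weightings listed above; the Knowledge Infusion correlation index itself is not reproduced here. The restart probability of 0.15, the handling of isolated items, and the toy item descriptions are assumptions of this sketch, not values from the talk.

```python
import math


def cosine(a, b):
    """Cosine similarity between two keyword-count dicts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def correlation_matrix(descriptions):
    """Build w[i][j] from item descriptions, with zeros on the diagonal."""
    n = len(descriptions)
    return [[0.0 if i == j else cosine(descriptions[i], descriptions[j])
             for j in range(n)] for i in range(n)]


def random_walk_with_restart(w, liked, restart=0.15, iterations=200, tol=1e-12):
    """Visiting probabilities of a walk that jumps back to the liked items.

    w: square weight matrix; liked: set of item indices the user liked.
    Items with no outgoing weight send their mass back to the liked items
    (an assumption of this sketch).
    """
    n = len(w)
    restart_vec = [1.0 / len(liked) if i in liked else 0.0 for i in range(n)]
    row_sums = [sum(row) for row in w]
    p = restart_vec[:]
    for _ in range(iterations):
        new = [restart * r for r in restart_vec]
        for i in range(n):
            mass = (1.0 - restart) * p[i]
            if row_sums[i] == 0.0:
                for j in range(n):
                    new[j] += mass * restart_vec[j]
            else:
                for j in range(n):
                    new[j] += mass * w[i][j] / row_sums[i]
        converged = max(abs(a - b) for a, b in zip(new, p)) < tol
        p = new
        if converged:
            break
    return p


def recommend(p, liked):
    """Rank items the user has not liked by decreasing visiting probability."""
    candidates = [i for i in range(len(p)) if i not in liked]
    return sorted(candidates, key=lambda i: (-p[i], i))


# Toy plot-keyword descriptions, made up for illustration.
toy_items = [
    {"space": 1, "alien": 1},
    {"space": 1, "war": 1},
    {"war": 1, "history": 1},
    {"cooking": 1},
]
toy_w = correlation_matrix(toy_items)
toy_p = random_walk_with_restart(toy_w, liked={0})
toy_ranking = recommend(toy_p, liked={0})
```

Swapping in a different weighting (co-rating counts or a KI-based index) only changes how `w` is built; the walk itself is unchanged.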


Experimental Evaluation: Goal

Validation of the hypothesis that recommendations produced by RWR-KI are serendipitous (relevant/attractive & unexpected/surprising)

Not only an issue of metrics!

Difficulty of detecting and providing an objective assessment of the emotional response conveyed by serendipitous recommendations

Difficulty of assessing the user perception of serendipity of recommendations and their acceptance (in terms of relevance and unexpectedness)

Difficulty of assessing unexpectedness

M. de Gemmis, P. Lops, G. Semeraro, C. Musto. An Investigation on the Serendipity Problem in Recommender Systems. Information Processing and Management, 2015 (in press). DOI: 10.1016/j.ipm.2015.06.008

Experimental Evaluation

2 experiments

In-vitro

User study

In-vitro experiment

Unexpectedness measured as deviation from a standard prediction criterion (Murakami et al. 2008)

Standard prediction criterion: (non-personalized) popularity

User study

Analysis performed using Noldus FaceReader™

Allows analysis of users’ facial expressions to gather implicit feedback about their reactions

(Murakami et al. 2008) T. Murakami, K. Mori, R. Orihara. Metrics for Evaluating the Serendipity of Recommendation Lists, in K. Satoh, A. Inokuchi, K. Nagao, T. Kawamura (Eds.), New Frontiers in Artificial Intelligence, Lecture Notes in Computer Science 4914, pp. 40–46, Springer, 2008.
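One simple way to read "deviation from a standard prediction criterion" when that criterion is non-personalized popularity is to count how many recommended items fall outside the most popular ones. The sketch below takes that simplified set-difference reading; it is not the exact formulation of Murakami et al., and the function names, the `k` cut-off and the toy ratings are assumptions for illustration.

```python
from collections import Counter


def popularity_top_k(ratings, k):
    """Return the set of the k most-rated items.

    ratings: iterable of (user, item) pairs (hypothetical input format).
    """
    counts = Counter(item for _, item in ratings)
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return set(ranked[:k])


def unexpectedness(recommended, obvious):
    """Share of recommended items that the popularity baseline would miss."""
    if not recommended:
        return 0.0
    surprising = sum(1 for item in recommended if item not in obvious)
    return surprising / len(recommended)


# Made-up ratings: item A is rated 3 times, B twice, C once.
toy_ratings = [("u1", "A"), ("u2", "A"), ("u3", "A"),
               ("u1", "B"), ("u2", "B"), ("u1", "C")]
obvious_items = popularity_top_k(toy_ratings, k=2)
score = unexpectedness(["A", "C", "D", "E"], obvious_items)
```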

Noldus FaceReader™

Recognizes basic emotions: 6 categories of emotions, proposed by Ekman (1999)

happiness, anger, sadness, fear, disgust, surprise

(Ekman 1999) P. Ekman. Basic Emotions, in T. Dalgleish, M.J. Power (Eds.), Handbook of Cognition and Emotion, 45–60, John Wiley & Sons, 1999.

Basic emotions (Ekman, 1999)

Discrete classes model

Different sets

Darwin (1872), The Expression of the Emotions in Man and Animals

Ekman definition (6 + neutral)

Happiness

Sadness

Fear

Anger

Surprise

Disgust

The problem

• Classification accuracy: ~90% on the Radboud Faces Database (RaFD) (Langner et al. 2010)

(Langner et al. 2010) O. Langner, R. Doetsch, G. Bijlstra, D.H.J. Wigboldus, S.T. Hawk, A. van Knippenberg. Presentation and Validation of the Radboud Faces Database. Cognition and Emotion 24(8), 1377–1388, 2010.

Experimental Evaluation: Noldus FaceReader™

Experimental Evaluation (user study): Dataset

Experimental units: 40 master's students (engineering, architecture, economics, computer science and humanities)

26 male (65%), 14 female (35%)

Age distribution: from 20 to 35

Dataset

2,135 movies released between 2006 and 2011

Movie content (title, poster, plot keywords, cast, director, summary) crawled from the Internet Movie Database (IMDb)

Vocabulary of 32,583 plot keywords

Average: 12.33 keywords/item

Experimental Evaluation (user study): Design of the experiment

Between-subjects controlled experiment

20 users randomly assigned to test RWR-KI

20 users randomly assigned to test RANDOM (control group), a baseline inspired by the blind luck principle; it produces random suggestions, which showed surprisingly good performance in the 1st In-vitro experiment

Procedure

Users interact with a web application that

– shows details of movies

– displays 5 recommendations (movie poster & title) per user

Recommended items displayed 1 at a time

Web application

Experimental Evaluation (user study): Design of the experiment

Procedure

2 binary questions to assess user acceptance

– “Did you know this movie?” / “Have you ever heard about this movie?” (unexpectedness)

– “Do you like this movie?” (relevance)

– (NO, YES) answers → serendipitous recommendation

Video recording starts when a movie is recommended to the user and stops when the answers to the 2 questions are collected

5 videos per user

Noldus FaceReader™ used to analyze videos and assess user emotional response when exposed to recommendations

Experimental Evaluation (user study): Design of the experiment

Questionnaire analysis

Quality of RWR-KI and RANDOM

Metrics

Relevance@N = #relevant_items / N

Unexpectedness@N = #unexpected_items / N

Serendipity@N = #serendipitous_items / N = #(relevant_items ∩ unexpected_items) / N

N = size of the recommendation list
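The three metrics can be computed directly from the two binary questionnaire answers collected for each recommended item. The sketch below assumes each answer is stored as a `(knew_it, liked_it)` pair of booleans; that encoding, the function name and the sample answers are made up for illustration.

```python
def questionnaire_metrics(answers):
    """Compute Relevance@N, Unexpectedness@N and Serendipity@N.

    answers: list of (knew_it, liked_it) booleans, one pair per
             recommended item; N is the length of the list.
    """
    n = len(answers)
    if n == 0:
        raise ValueError("the recommendation list is empty")
    relevant = sum(1 for knew, liked in answers if liked)
    unexpected = sum(1 for knew, liked in answers if not knew)
    # Serendipitous = relevant AND unexpected, i.e. the (NO, YES) answers.
    serendipitous = sum(1 for knew, liked in answers if liked and not knew)
    return {
        "relevance": relevant / n,
        "unexpectedness": unexpected / n,
        "serendipity": serendipitous / n,
    }


# Hypothetical answers for one user's Top-5 list.
sample_answers = [
    (False, True),   # unknown and liked: serendipitous
    (False, False),  # unknown but not liked
    (True, True),    # liked but already known
    (False, True),   # serendipitous
    (True, False),   # known and not liked
]
sample_metrics = questionnaire_metrics(sample_answers)
```

Because serendipitous items are the intersection of the other two sets, Serendipity@N can never exceed either Relevance@N or Unexpectedness@N.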

Experimental Evaluation (user study): Design of the experiment

Questionnaire analysis

ResQue model (Chen et al. 2010)

– category: Perceived System Qualities

– sub-category: Quality of Recommended Items

– Relevance = perceived accuracy

– Unexpectedness = novelty

(Chen et al. 2010) L. Chen, P. Pu. A User-Centric Evaluation Framework of Recommender Systems, in: B.P. Knijnenburg, L. Schmidt-Thieme, D. Bollen (Eds.), Proceedings of the ACM RecSys 2010 Workshop on User-Centric Evaluation of Recommender Systems and Their Interfaces (UCERSTI), CEUR Workshop Proceedings 612, 14–21, CEUR-WS.org, 2010.

Experimental Evaluation (user study): Results

Questionnaire analysis

Serendipity: RWR-KI outperforms RANDOM

Statistically significant differences (Mann-Whitney U test, p < 0.05)

~ Half of the recommendations are deemed serendipitous!

RWR-KI: a better Relevance-Unexpectedness trade-off

RANDOM: more unbalanced towards Unexpectedness
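For readers unfamiliar with the Mann-Whitney U test used to compare the two groups, here is a minimal pure-Python sketch. It uses the tie-corrected normal approximation without continuity correction, which is an assumption of this sketch; the talk does not say which variant or software was used, and the sample values are illustrative numbers only, not data from the study.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))

    # Tied values share the mean of the ranks they occupy.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    if sigma == 0:
        return u_x, 1.0  # all observations identical
    z = (u_x - n1 * n2 / 2) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value


# Illustrative numbers only: two completely separated samples.
u_stat, p_val = mann_whitney_u([1, 2, 3], [4, 5, 6])
```

In the setting of this study, `x` and `y` would hold one score per user for the RWR-KI and RANDOM groups, for example each user's Serendipity@5.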

Experimental Evaluation (user study): Results

Questionnaire analysis: distribution of serendipitous items within Top-5 lists

Almost all users (19 out of 20) received at least 1 serendipitous suggestion

Most of RWR-KI lists: 2-3 serendipitous items

Most of RANDOM lists: 1-2 serendipitous items

Experimental Evaluation (user study): Results

Analysis of user emotions

Hypothesis: users’ facial expressions convey a mixture of emotions that helps to measure the perception of serendipity of recommendations

Serendipity associated with surprise and happiness

ResQue model: attractiveness

200 videos (40 users x 5 recommendations)

41 videos filtered out (< 5 seconds)

For the remaining 159 videos, FaceReader™ computed the distribution of detected emotions + duration (emotions lasting < 1 sec. filtered out)
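The two filtering steps and the resulting frequency distribution can be sketched as follows. The input format (a list of videos, each with a duration and a list of detected emotion segments) is hypothetical and is not FaceReader's actual export format; counting each surviving segment once is also an assumption, since the talk does not state how frequencies were aggregated.

```python
from collections import Counter

MIN_VIDEO_SECONDS = 5.0    # videos shorter than this are discarded
MIN_EMOTION_SECONDS = 1.0  # shorter emotion segments are discarded


def emotion_frequencies(videos):
    """Return emotion -> share of the emotion segments that pass both filters.

    videos: list of dicts with 'duration' (seconds) and 'segments',
            a list of (emotion, seconds) pairs (hypothetical format).
    """
    counts = Counter()
    for video in videos:
        if video["duration"] < MIN_VIDEO_SECONDS:
            continue
        for emotion, seconds in video["segments"]:
            if seconds >= MIN_EMOTION_SECONDS:
                counts[emotion] += 1
    total = sum(counts.values())
    return {emotion: count / total for emotion, count in counts.items()} if total else {}


# Made-up recordings for illustration.
toy_videos = [
    {"duration": 12.0, "segments": [("surprise", 2.0), ("happiness", 1.5), ("anger", 0.4)]},
    {"duration": 3.0, "segments": [("sadness", 2.0)]},  # dropped: video too short
    {"duration": 8.0, "segments": [("surprise", 1.0), ("neutral", 4.0)]},
]
toy_frequencies = emotion_frequencies(toy_videos)
```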

Circumplex model

Maps basic emotions onto a dimensional model

[Figure: circumplex with a Valence axis (negative to positive) and an Arousal axis (low to high); emotions plotted: neutral, sadness, fear, disgust, surprise, joy, anger]

Russell, James (1980). "A circumplex model of affect". Journal of Personality and Social Psychology 39: 1161–1178. doi:10.1037/h0077714

Experimental Evaluation (user study): Results

Frequency analysis of user emotions associated with serendipitous suggestions (69 videos = 81 – 12)

Surprise: 17% RWR-KI vs 9% RANDOM

Happiness: 14% RWR-KI vs 9% RANDOM

RWR-KI produces more serendipitous suggestions than RANDOM! (confirms the questionnaire results)

High values of negative emotions (sadness and anger); why?

[Chart labels: 39 videos; 30 videos]

Experimental Evaluation (user study): Results

Frequency analysis of user emotions associated with non-serendipitous suggestions (90 videos = 119 – 29)

General decrease of surprise and happiness

High values of negative emotions (sadness and anger), also in this case

Explanation: the negative emotions arise because users assumed troubled expressions while concentrating hard on the task

[Chart labels: 39 videos; 51 videos]


Experimental Evaluation (user study): Conclusions

Positive emotions: marked difference between RWR-KI and RANDOM

Positive emotions: marked difference between serendipitous and non-serendipitous recommendations

Agreement between questionnaires (explicit feedback) & facial expressions/emotions (implicit feedback)

Emotions can help to assess the actual perception of serendipity

A step forward to the creation of a ground truth for evaluation purposes

Thanks…Questions?

Semantic Web Access and Personalization research group http://www.di.uniba.it/~swap

Pierpaolo Basile

Marco de Gemmis

Pasquale Lops

Fedelucio Narducci

Annalina Caputo

Leo Iaquinta

Cataldo Musto

Marco Polignano

Giovanni Semeraro

See you in Vienna!

9th ACM Conference on Recommender Systems

Vienna, Austria

16th-20th September 2015

References

(André et al. 2009) P. André, J. Teevan, S.T. Dumais. From x-rays to silly putty via Uranus: serendipity and its role in web search. Proc. ACM CHI 2009, ACM, New York, NY, USA, 2009.

(Bordino et al. 2013) I. Bordino, Y. Mejova, M. Lalmas. Penguins in sweaters, or serendipitous entity search on user-generated content. Proc. 22nd ACM CIKM 2013, ACM, New York, NY, USA, 2013, pp. 109–118.

(Basile et al. 2014) P. Basile, M. de Gemmis, P. Lops, G. Semeraro. Solving a Complex Language Game by using Knowledge-based Word Associations Discovery. IEEE Transactions on Computational Intelligence and AI in Games, 2015 (in press). DOI: 10.1109/TCIAIG.2014.2355859.

(Chen et al. 2010) L. Chen, P. Pu. A User-Centric Evaluation Framework of Recommender Systems, in: B.P. Knijnenburg, L. Schmidt-Thieme, D. Bollen (Eds.), Proceedings of the ACM RecSys 2010 Workshop on User-Centric Evaluation of Recommender Systems and Their Interfaces (UCERSTI), CEUR Workshop Proceedings 612, 14–21, CEUR-WS.org, 2010.

(de Gemmis et al. 2014) M. de Gemmis, P. Lops, G. Semeraro, C. Musto. An Investigation on the Serendipity Problem in Recommender Systems. Information Processing and Management (in press). DOI: 10.1016/j.ipm.2015.06.008.

(Ekman 1999) P. Ekman. Basic Emotions, in T. Dalgleish, M.J. Power (Eds.), Handbook of Cognition and Emotion, 45–60, John Wiley & Sons, 1999.

(Herlocker et al. 2004) J.L. Herlocker, J.A. Konstan, L.G. Terveen, J.T. Riedl. Evaluating Collaborative Filtering Recommender Systems. ACM Transactions on Information Systems 22(1): 5–53, 2004.

(Kramer et al. 2014) A.D.I. Kramer, J.E. Guillory, J.T. Hancock. Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences of the United States of America 111(24): 8788–8790, 2014.

(Langner et al. 2010) O. Langner, R. Doetsch, G. Bijlstra, D.H.J. Wigboldus, S.T. Hawk, A. van Knippenberg. Presentation and Validation of the Radboud Faces Database. Cognition and Emotion 24(8), 1377–1388, 2010.

(Lauw et al. 2010) H.W. Lauw, J.C. Schafer, R. Agrawal, A. Ntoulas. Homophily in the Digital World: A LiveJournal Case Study. IEEE Internet Computing 14(2): 15–23, March-April 2010.

(Lovasz 1996) L. Lovasz. Random Walks on Graphs: a Survey. Combinatorics 2:1–46, 1996.

(McNee et al. 2006) S.M. McNee, J. Riedl, J.A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In CHI ’06 Extended Abstracts on Human Factors in Computing Systems, CHI EA ’06, 1097–1101, ACM, New York, NY, USA, 2006.

(Murakami et al. 2008) T. Murakami, K. Mori, R. Orihara. Metrics for Evaluating the Serendipity of Recommendation Lists, in K. Satoh, A. Inokuchi, K. Nagao, T. Kawamura (Eds.), New Frontiers in Artificial Intelligence, Lecture Notes in Computer Science 4914, pp. 40–46, Springer, 2008.

(Pariser 2011) E. Pariser. The Filter Bubble: What the Internet Is Hiding from You. Penguin Group, May 2011.

(Roy 2001) A. Roy. Power Politics. South End Press, January 2001.

(Russell 1980) J. Russell. A circumplex model of affect. Journal of Personality and Social Psychology 39: 1161–1178, 1980. doi:10.1037/h0077714

(Semeraro et al. 2012) G. Semeraro, M. de Gemmis, P. Lops, P. Basile. An Artificial Player for a Language Game. IEEE Intelligent Systems 27(5): 36–43, 2012.

(Shani and Gunawardana 2011) G. Shani, A. Gunawardana. Evaluating Recommendation Systems. In F. Ricci, L. Rokach, B. Shapira, P.B. Kantor (Eds.), Recommender Systems Handbook, Springer, 2011, pp. 257–297.

(Toms 2000) E. Toms. Serendipitous Information Retrieval. Proc. 1st DELOS NoE Workshop on Information Seeking, Searching and Querying in Digital Libraries, Zurich, Switzerland: ERCIM, 2000.

(Zuckerman 2008) E. Zuckerman. Homophily, serendipity, xenophilia. April 25, 2008. www.ethanzuckerman.com/blog/2008/04/25/homophily-serendipity-xenophilia/