A Field Study of a Visual Controllable Talk Recommender

A Field Study of a Visual Controllable Talk Recommender Denis Parra

School of Information Sciences University of Pittsburgh & PUC Chile

135 N Bellefield Avenue, Pittsburgh, PA 15213 [email protected], [email protected]

Peter Brusilovsky School of Information Sciences

University of Pittsburgh 135 N Bellefield Avenue, Pittsburgh, PA 15213

[email protected]

ABSTRACT In this paper, we contribute to the study of recommender systems from a HCI perspective by investigating the effects upon the user experience of a novel interface which uses a Venn diagram to represent the outputs of an interactive talk recommender system. We present the results of a preliminary user study on talk recommendations in the context of a conference with n=37 people that used our system under one of two conditions: a static list of recommendations, or the enhanced visual controllable interface. The user behavioral analysis and the results of a survey that n=17 users answered provide interesting insights for designers and developers of interfaces for recommender systems, especially when the items can have one or more contexts of relevancy as in a hybrid recommender system.

Categories and Subject Descriptors H.5.2. Information interfaces and presentation (e.g., HCI): User interfaces. H.5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

General Terms Human Factors; Design; Experimentation.

Keywords User interfaces for recommender systems, information visualization, user studies.

1. INTRODUCTION Recommender systems’ research in aspects related to Human Computer Interaction (HCI) is scarce, such as studying the effects of showing the recommendations in a different representation than a ranked list, or investigating richer forms of user interaction than rating items or scrolling a list of them. In this paper we describe and present the results of a field study over users of a conference talk recommender system during a recently held academic conference, ACM CSCW 2013. The main research question we address is to which extent providing users interactive control [1, 2] and a novel visualization that supports inspectability as different depths of field in the form of a Venn diagram affects the user experience with the recommender system.

The rest of the paper is structured as follows. In Section 2 we present recent related work in presentation, interaction and

explainability on recommender systems. Then, in Section 3 we describe our visual and interaction design, we continue in Section 4 presenting our evaluation where we report behavioral analysis and also results of a survey. In Section 5 we discuss the results and limitations of our work. Finally, we present our conclusions and ideas for future work in the Section 6.

2. RELATED WORK The development of interfaces that present recommended items in a model different that a ranked list is rather scarce. Some examples include SFViz [3], a sunburst visualization to allow users finding interest-based in last.fm, and Pharos [4] a social map visualization of latent communities. Other examples that also include a richer user interaction are PeerChooser [5], SmallWorlds[6] – that focus in representing collaborative filtering- and TasteWeights [1], an interactive system that represents a hybrid music recommender. In our previous work, in [7] we proposed the prototype of two different visualizations to present a personalized conference’s program schedule -including recommendations of papers-, a concentric-circle layout and a graph model. In [8] we adapt the existent Aduna visualization to let users explore and find relevant conference talks by analyzing the connections of different entities such as user bookmarks, recommender algorithms and user tags. Through a user study, we found that the effectiveness and probability of item selection both increase when users are able to explore and interrelate multiple entities. We argue that this interface helped users to increase their perception of explainability of the recommendations, a concept that has been already shown to have a positive effect on user satisfaction [9].

3. An Interactive Visual Interface for Talk Recommendation Motivated by our previous results in [8] showing that the intersection of several sources of relevance increases users likelihood to accept a recommendation, and also inspired by the work of [1] indicating the positive effect of user control using sliders, we integrated these elements in the design of our system. Figure 1 (a) shows the whole interface of the hybrid talk recommender, with the Venn diagram presented in more detail in Figure 1 (b). It indicates three big ellipses that represent three different recommendation methods (A-blue, B-yellow and C-red), the small circles represent recommended items, and the areas where ellipses intersect are aimed to help the user at telling which items have been recommended by more than one method. The colors are intended to tell the different recommendation methods:

A. Blue (Author Impact): Talks identified relevant because one or more of their authors have been frequently cited in the ACM Digital Library

B. Yellow (Similar Content): Talks which content is similar to the articles previously identified as relevant

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ChileCHI’13, Month 1–2, 2013, Temuco, Chile. Copyright 2013 ACM 1-58113-000-xxxxxx …$15.00.

(a) (b)

Figure 1. (a) Talk recommender interface. In addition to browsing the list of articles (3), the user can control the importance of each method using sliders (2) and inspect the items using a Venn diagram (1). A detailed view of the Venn diagram is shown in (b).C. Red (Articles of Co-authors): Articles written by the co-

authors of selected authors from previous CSCW and CHI conferences (explained in more detail in the Evaluation section).

3.1 Explainability and User control Explainability has shown a significant effect on user satisfaction in previous studies, though with certain nuances such as the domain expertise [10]. However, based on our previous work [8] we introduced a Venn diagram to let users easily see intersections of relevancy from different methods, as seen in the Figure 1 (b).

User control has shown an important effect on user satisfaction with a recommender system, so building from the ideas presented in [1, 2] we decided to incorporate sliders that allow the user to set the importance of each recommendation method. In Figure 1, (2) shows the sliders that allow the users to change the weight of each method in a range from 0 to 1 with steps of 0.1.

4. EVALUATION To evaluate our system we run a user study with people attending the CSCW 2013 conference in San Antonio, TX. Using the online conference support system Conference Navigator [11] -that allows users to access the conference program, proceedings, paper’s details, and also to schedule the favorite talks- we provided the visual controllable recommender as a new feature.

Participants. 168 people used Conference Navigator during the CSCW conference (this period spans from February 17th of 2013 to March 3rd of 2013), and 37 of those people used the talk recommender.

Procedure. Each user that authenticated into Conference Navigator was randomly assigned to one of two conditions: the non-controllable recommender (a static list of recommendations) and the visual controllable recommender (the interface shown in Figure 1). Each condition had a different promotion message in the home page of the CSCW 2013 conference in Conference Navigator. The users that clicked in the link to use the Talk

Recommendations had to complete 2 steps in order to obtain recommendations: 1) Select their favorite authors of papers from previous CSCW and CHI from a list of the 100 most cited authors of the CSCW and CHI conferences, and, 2) choose the articles authored by these authors in CSCW and CHI conferences, which will be used to generate recommendations. After the second step was completed, users saw the list of recommended talks.

4.1 Results To assess the impact of our visual controllable interface we tracked a specific set of users’ actions under both interfaces. In addition, all users who tried the recommendation functionality received an invitation to answer a brief survey.

4.1.1 Usage and Engagement In total, 168 people created their accounts and used Conference Navigator during the CSCW conference (this period spans from February 17th of 2013 to March 3rd of 2013). Out of the total number, 161 users were exposed to one or another recommendation condition (84 to visual and 77 to regular interface). Working with a recommender component was an option, not a requirement. The only engagement strategy used was the advertising of one or another recommender interface on the system home page. Out of 161 exposed users, 37 used the talk recommender. 22 people used the visual controllable interface and 15 the non-controllable interface. From this group of 37 people, 17 answered a survey to obtain feedback regarding their experience using the talk recommendation.

The engagement data provide some evidence that the visual interface was more attractive for the users. While the number of users exposed to both conditions was about the same, 22 were sufficiently interested in the visual option to go through the three steps needed until they could actually see the recommendations, compared to only 15 that used the interface with a static list of recommendations. The average number of rated or bookmarked talks per active users was also comparable.

Figure 2. Average number of actions per user. The horizontal axis indicates the type of actions. The dark grey bars indicates

actions in the static list interface, and the light grey ones in the visual controllable interface. Further insights are provided by the analysis of user interaction with talks within the interfaces. As the data shows (Table 1), the visual interface engaged more users into the active work with the papers (vs. passive browsing) – 10 users (40%) were rating talks in the visual interface vs. only 3 (25%) in the regular one. The number of users bookmarking the talks was comparable.

Table 1. Usage metrics of each recommender interface: Condition A (visual controllable) and Condition B (static list)

Metric Condition A Condition B # Exposed to recommendations 84 77

# Used recommendations 22 15

# Answered survey 11 6 # Total Bookmarks / avg per user 28 / 4.67 32/ 4 # Users who bookmarked 6 8

# Total Papers rated / avg per user 130 / 13 45 / 15 # Users who rated 10 3 Average user rating 3.73 3.34

4.1.2 Action Analysis To understand the value of the interactive functionality to the users, we analyzed mouse clicks and hovers over different elements of the interface. A good overview of interactive feature usage is provided by Figure 2 that shows the average numbers of actions per user of each type of action

Sliders. In average, users clicked 2.3 times the button “click Update list”, meaning that they tried to update the recommended list. In terms of which method did they prefer by moving the sliders, in average, the users changed the slider of the method C (articles of co-authors) 6 times, compared to 5.5 times of method A (papers of famous authors) and only 2.5 times of method b (content-based recommender). Further log analysis shows that for method A 22% of the times the weight was set to more than 0.5, compared to 80% of slider B and 50% of slider C. Venn Diagram. Users could click over the large ellipses (Venn diagram area) or hover over the circles representing papers of each method or intersection of methods. Users explored in average 6.9 papers in the area B of content-based recommended papers, 4.7 papers of the method C “articles by co-authors” and only 3.1 papers in the area A of papers with high “author impact”. This is only slightly higher than the average number of papers explored in the intersection area of methods BC, which was 2, supporting the larger relevancy of methods B and C, and also the help of the Venn diagram to help identifying the papers recommended by both methods.

4.1.3 Impact The usage data indicates that the visual interface was more engaging to the users in some aspects and that the users were actively using the offered interactive functionality. However, does it offered any advantage to its users? We analyzed two impact factors – the average talk ratings and the average rank of relevant talks. A difference in average user rating might provide some evidence that one interface helps the users to find better talks. As the data shows, the average rating of ranked talks is, indeed, a bit higher in visual controllable interface is larger (mean average rating of 3.73) than in the non-controllable interface (mean average rating 3.34). A difference in the average rank of good talks is evidence that one system can better “push” good talks to the top of the ranked lists (lower ranks) where the talks have more chances to be noticed. In our case, indeed, the average rank of papers rated 4 or 5 was 9.26 for the visual controllable condition and 11.78 for the static list of papers. Unfortunately we had too few users to run a statistical significance test, so both observations provide merely a hint, not reliable evidence in favor of the visual system.

4.1.4 Survey Analysis A survey was conducted among participants of the study. An e-mail was sent to users that utilized the recommender feature in Conference Navigator after the conference had finished inviting them to answer the survey about their experience using the talk recommendation feature. 11 people answered the survey under the controllable condition and 6 under the non-controllable condition. Table 2 shows the questions to which users had to answer how much do they agreed with the statement present.

5. DISCUSSION The evaluation results indicate that the visual controllable recommender had a positive effect on the user experience with the recommender system. Among the different user actions described in the behavioral analysis, users’ hovering the mouse over the circles in the Venn diagram indicates that users have a real interest for exploring recommendations, what is supported by the positive answers to the questions C.4, C.5, and C.6 in the survey.

One result that looks strange is the fact that people were more likely to rate papers than bookmarking them. We explain this behavior by the system failure in leveraging people’s feedback: we did not consider the users’ rating to re-generate recommendations, something that could have affected the user trust in the system since their feedback was not considered immediately to generate new recommendations.

Table 2. 1Results of the post-study survey (*=significant at p<0.05 using Mann-Whitney Test)

Question in both conditions Visual recommender Static recommender 1. The talks recommended matched my interests. 3.64±0.12 3±0.36 2. The talks recommended were diverse. 3.73±0.18 3.33±0.33 8. I would use this recommender system again for another conference in the future. 3.64±0.23 3.33±0.33 10. I do not think that a social conference support system - like Conference Navigator- needs Talk Recommendation functionality 2.36±0.23* 3.5±0.42

11. I understood why the talks were recommended to me. 3.45±0.24 2.83±0.47

Additional questions in the visual controllable recommender condition

C.1. I felt in control of combining different recommendation methods using the sliders. 3.64±0.16 - C.3. The ability to control the recommendation methods increases my trust in the list of recommended talks 3.18±0.2 - C.4. When looking at the list of recommended talks I am interested to examine which recommendation method has been used. 3.9±0.29 - C.5. I think the Venn diagram visualization helped me to understand why a talk was recommended. 3.82±0.28 - C.6. I think the Venn diagram visualization was useful to identify talks recommended by a specific recommendation method or by a combination of methods. 3.72±0.25 -

1 The table shows subset of all the questions. The complete survey is available at http://www.sis.pitt.edu/~dparra/VennRecSurvey.pdf

The survey indicates that people are more likely to think that a talk recommender system is necessary when using the visual controllable interface, what might be a result of being aware of why certain talks are recommended. The average rating of question C.4 “When looking at the list of recommended talks I am interested to …” (3.9) supports the idea that the interface triggers a need for understanding how the talks were recommended, and the average rating of C.5 “I think the Venn diagram visualization helped me to understand why a talk was recommended” (3.82) suggests that the Venn diagram helps to support that need.

6. CONCLUSIONS In this paper we introduced a novel controllable interface to visualize recommended conference talks –beyond the classic model of a ranked list- with enriched user interaction. The results of our field study that include a behavioral analysis and a user survey indicate that the visual controllable recommender interface had a positive effect over the static list representation. Though we cannot make conclusive claims due to the small number of subjects, these results support our hypothesis in a positive direction with respect to larger user laboratory study that we are currently conducting. We expect that our contributions on this area help to fulfill the lack of research at user interface level in recommender systems.

7. REFERENCES [1] Bostandjiev, S., O'Donovan, J. and Höllerer, T.

TasteWeights: a visual interactive hybrid recommender system. In Proceedings the sixth ACM conference on Recommender systems (Dublin, Ireland, 2012).

[2] Knijnenburg, B. P., Bostandjiev, S., O'Donovan, J. and Kobsa, A. Inspectability and control in social recommenders In Proceedings the sixth ACM conference on Recommender systems (Dublin, Ireland, 2012).

[3] Gou, L., You, F., Guo, J., Wu, L. and Zhang, X. SFViz: interest-based friends exploration and recommendation in social

networks. In Proceedings of the 2011 Visual Information Communication - International Symposium (VINCI '11)

[4] Zhao, S., Zhou, M. X., Zhang, X., Yuan, Q., Zheng, W. and Fu, R. Who is Doing What and When: Social Map-Based Recommendation for Content-Centric Social Web Sites. ACM Trans. Intell. Syst. Technol., 3, 1 2011), 1-23.

[5] O'Donovan, J., Smyth, B., Gretarsson, B., Bostandjiev, S. and Höllerer, Tobias PeerChooser: visual interactive recommendation. ACM, 2008.

[6] Gretarsson, B., O'Donovan, J., Bostandjiev, S., Hall, C., Höllerer, T., and llererk. Smallworlds: visualizing social recommendations. In Proceedings of the 12th Eurographics, IEEE - VGTC conference on Visualization (Bordeaux, France, 2010).

[7] Parra, D. Beyond lists: studying the effect of different recommendation visualizations. In Proceedings of the sixth ACM conference on Recommender systems (Dublin, Ireland, 2012). ACM.

[8] Verbert, K., Parra, D., Brusilovsky, P. and Duval, E. Visualizing recommendations to support exploration, transparency and controllability. In Proceedings of the 2013 international conference on Intelligent user interfaces (Santa Monica, California, USA, 2013).

[9] Zhang, J., Jones, N. and Pu, P. A visual interface for critiquing-based recommender systems. In Proceedings of the 9th ACM conference on Electronic commerce (Chicago, Il, USA, 2008). ACM.

[10] Knijnenburg, B. P., Reijmer, N. J. M. and Willemsen, M. C. Each to his own: how different users call for different interaction methods in recommender systems. In Proceedings the fifth ACM conference on Recommender systems (Chicago, USA, 2011).

[11] Parra, D., Jeng, W., Brusilovsky, P., López, C. and Sahebi, S. Conference Navigator 3: An Online Social Conference Support System. Poster at UMAP 2012. Montreal, Canada.

A Field Study of a Visual Controllable Talk Recommender

Documents

Transcript of A Field Study of a Visual Controllable Talk Recommender