CLEF NewsREEL 2016 Overview


News REcommendation Evaluation Lab (NewsREEL)

Lab Overview

Frank Hopfgartner, Benjamin Kille, Andreas Lommatzsch, Martha Larson, Torben Brodt, Jonas Seiler

Recommender systems or recommendation systems are a subclass of information filtering systems that seek to predict the "rating" or "preference" that a user would give to an item.

Recommender Systems

[Diagram: Items (set-based recommenders) vs. Items (streams)]

• Recommender Systems
• Evaluation
• NewsREEL scenario
• NewsREEL 2016

Outline

How do we evaluate?

Academia:
• Static, often rather old datasets
• Offline evaluation
• Focus on algorithms and precision

Industry:
• Dynamic dataset
• Online A/B testing
• Focus on user satisfaction

Example: Recommending sites in Evora

[Images: Sé Catedral, Capela dos Ossos, Templo romano, Palacio de Don Manuel I. Source: Wikipedia]

1. Choose a time point t0 to split the dataset
2. Classify ratings before t0 as training set
3. Classify ratings after t0 as test set

Offline Evaluation: Dataset construction
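The split above can be sketched in a few lines; the toy ratings and the (user, item, rating, timestamp) layout below are illustrative, not the NewsREEL data format:

```python
from datetime import datetime

def split_by_time(ratings, t0):
    """Split (user, item, rating, timestamp) tuples at time t0:
    ratings before t0 form the training set, ratings at or after
    t0 form the test set."""
    train = [r for r in ratings if r[3] < t0]
    test = [r for r in ratings if r[3] >= t0]
    return train, test

# Hypothetical toy ratings for the Evora example
ratings = [
    ("Cristiano", "Se Catedral", 4, datetime(2016, 1, 10)),
    ("Marta", "Capela dos Ossos", 5, datetime(2016, 2, 2)),
    ("Luis", "Templo Romano", 2, datetime(2016, 3, 15)),
]
train, test = split_by_time(ratings, t0=datetime(2016, 3, 1))
# train holds the January/February ratings, test the March rating
```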

[Rating matrix: rows for users Cristiano, Marta, Luis; columns for Centro Historico, Capela dos Ossos, Templo Romano, Se Catedral, Almendres Cromlech; sparsely filled with the users' ratings]

1. Train rating function f(u,i) using training set
2. Predict rating for all pairs (u,i) of test set
3. Compute RMSE(f) over all rating predictions

Offline Evaluation: Benchmarking Recommendation Task
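The three benchmarking steps can be written out directly. The global-mean predictor below is only a stand-in for a trained f(u,i):

```python
import math

def rmse(predict, test_set):
    """Root-mean-square error of a rating function over all
    (user, item, true_rating) triples in the test set."""
    errors = [(predict(u, i) - r) ** 2 for (u, i, r) in test_set]
    return math.sqrt(sum(errors) / len(errors))

def global_mean_predictor(train_set):
    """Trivial baseline f(u,i): always predict the mean training rating."""
    mean = sum(r for (_, _, r) in train_set) / len(train_set)
    return lambda u, i: mean

f = global_mean_predictor([("Cristiano", "Capela dos Ossos", 3)])
test_set = [("Marta", "Se Catedral", 5), ("Luis", "Templo Romano", 1)]
print(rmse(f, test_set))  # errors +2.0 and -2.0 → RMSE 2.0
```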

• Ignores users’ previous interactions/preferences

• Does not consider trends/shifting preferences

• ...

• Technical challenges are ignored

Drawbacks of offline evaluation

And what about me?

Example: Online evaluation

Online Evaluation (A/B testing)

Compare performance, e.g., based on profit margins

[Diagram: variants A and B compared on profit, click-through rate, user retention time, required resources, and user satisfaction]
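As a toy illustration of the A/B comparison on click-through rate (all counts below are hypothetical):

```python
def click_through_rate(clicks, impressions):
    """CTR = clicks on recommendations / recommendations shown."""
    return clicks / impressions if impressions else 0.0

# Hypothetical counts for two recommender variants in an A/B test
ctr_a = click_through_rate(clicks=120, impressions=10_000)  # 0.012
ctr_b = click_through_rate(clicks=150, impressions=10_000)  # 0.015
winner = "A" if ctr_a > ctr_b else "B"                      # variant B wins
```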

• Large user base and costly infrastructure required

• Different evaluation metrics required

• Comparison to offline evaluation challenging

Drawbacks of online evaluation

And what about me?

• Academia and industry apply different evaluation approaches
• Limited transfer from offline to online scenario
• Multi-dimensional benchmarking
• Combination of different evaluation approaches

Evaluation challenges

• Recommender Systems
• Evaluation
• NewsREEL scenario
• NewsREEL 2016

Outline

In CLEF NewsREEL, participants can develop stream-based news recommendation algorithms and have them benchmarked (a) online by millions of users over the period of a few months, and (b) offline by simulating a live stream.

CLEF NewsREEL

NewsREEL scenario

Image: Courtesy of T. Brodt (plista)

NewsREEL scenario

Profit = clicks on recommendations
Benchmarking metric: click-through rate (CTR)

[Diagram: users request articles from the news portal, which requests recommendations from the recommender]

Task 2: Offline Evaluation

Data:
• Traffic and content updates of nine German-language news content provider websites
• Traffic: reading articles, clicking on recommendations
• Updates: adding and updating news articles

Evaluation:
• Simulation of the data stream using the Idomaar framework
• Participants have to predict interactions with the data stream
• Quality measured by the ratio of successful predictions to the total number of predictions

Predict interactions in a simulated data stream
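The success-ratio metric can be sketched as a replay loop; the "repeat the last item" predictor below is a hypothetical baseline, not a NewsREEL algorithm:

```python
def stream_accuracy(events, predict_next):
    """Replay a recorded event stream and return the ratio of
    successful next-item predictions to predictions made."""
    correct = total = 0
    history = []
    for event in events:
        if history:  # predict the upcoming item from what was seen so far
            total += 1
            if predict_next(history) == event:
                correct += 1
        history.append(event)
    return correct / total if total else 0.0

last_item = lambda history: history[-1]  # baseline: the last item repeats
stream_accuracy(["a", "a", "b", "b"], last_item)  # 2 of 3 predictions correct
```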

Simulation process

[Diagram: Idomaar simulates the recorded stream and replays article requests to the recommender]

Idomaar stream simulation


Task 1: Online Evaluation

Data:
• Provide recommendations for visitors of the news portals of plista's customers
• Ten portals (local news, sports, business, technology)
• Communication via the Open Recommendation Platform (ORP)

Evaluation:
• Benchmark own performance against other participants and baseline algorithms during three pre-defined evaluation windows
• Best algorithms determined in a final evaluation period
• Standard evaluation metrics

Recommend news articles in real-time

Real-Time Recommendation
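The ORP protocol itself is not reproduced here; as a hedged sketch of the kind of baseline that can answer recommendation requests in real time, a most-popular recommender might look like this (class and method names are made up for illustration):

```python
from collections import Counter

class MostPopularRecommender:
    """Minimal real-time baseline: count article impressions and
    recommend the currently most-read articles, excluding the one
    the visitor is reading."""

    def __init__(self):
        self.impressions = Counter()

    def impression(self, item_id):
        """Record that an article was read."""
        self.impressions[item_id] += 1

    def recommend(self, current_item, k=4):
        """Return up to k most-read articles other than current_item."""
        ranked = (i for i, _ in self.impressions.most_common())
        return [i for i in ranked if i != current_item][:k]

rec = MostPopularRecommender()
for item in ["a", "b", "a", "c", "a", "b"]:
    rec.impression(item)
rec.recommend("a", k=2)  # → ["b", "c"]
```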

T. Brodt and F. Hopfgartner, "Shedding Light on a Living Lab: The CLEF NewsREEL Open Recommendation Platform," in Proc. of IIiX 2014, Regensburg, Germany, pp. 223–226, 2014.

• Recommender Systems
• Evaluation
• NewsREEL scenario
• NewsREEL 2016

Outline

Participation


Task 1 – First evaluation window

Task 1 – Second evaluation window

Task 1 – Third Evaluation window

• NewsREEL presentations
– Online Algorithms and Data Analysis
– Frameworks and Algorithms

• Evaluation results and task winners

• Joint Session: LL4IR & NewsREEL: New Ideas

NewsREEL session
Tomorrow, 1:30pm – 3:30pm

More Information
• http://orp.plista.com
• http://www.clef-newsreel.org
• http://www.crowdrec.eu
• http://sigir.org/files/forum/2015D/p129.pdf

Thank you

Questions?