Mining Cross-Domain Rating Datasets from Structured Data on Twitter

  • date post

    23-Aug-2014

description

Slides about mining cross-domain ratings presented at the WWW 2014 conference on April 8, in Seoul (Korea) by Simon Dooms.

Transcript of Mining Cross-Domain Rating Datasets from Structured Data on Twitter

  • Mining Cross-Domain Rating Datasets from Structured Data on Twitter @sidooms Simon Dooms
  • Rating Datasets. What are ratings? Explicit user preference information. Why ratings? Recommender systems. Talk outline: Intro, Social Sharing, Results, Cross-Domain, Conclusion. Apr. 08, 2014, Simon Dooms, Ghent University, MSM 2014.
  • Ratings Scarcity in Research. Ratings = private data. Public datasets to the rescue? MovieLens 100K (1998), MovieLens 1M (2000), MovieLens 10M (2008); more on recsyswiki.com. Old, synthetic datasets.
  • Social Sharing = Ratings Goldmine. Previous research: MovieTweetings, a movie rating dataset built from IMDb ratings shared on Twitter (https://github.com/sidooms/MovieTweetings). What about other domains? Websites? Well, let's try it out!
  • Target Websites - Goodreads. Fields: Twitter user, rating, book title, book author, Goodreads URL, time.
  • Target Websites - Pandora. Fields: Twitter user, song, Pandora URL, time.
  • Target Websites - YouTube. Fields: Twitter user, (video uploader), YouTube URL, time.
  • Mining Experiment. But words are wind: 2-week experiment, 4 online platforms.
  • Python code + Task Scheduler = Dataset files. https://github.com/sidooms/Twitter-ratings
  • The Numbers. One more thing.
  • Cross-Domain Rating Dataset
  • Applications. Collect ratings for recsys research / input. Cross-domain recsys research. Trend detection, analytics, ... Applicable to all social sharing websites.
  • Conclusions. Ratings scarcity in research. Public datasets are old and synthetic. Social sharing = ratings goldmine. 2-week experiment, 4 major websites. Python code & datasets on GitHub. True cross-domain ratings dataset.
  • @sidooms Simon Dooms Mining Cross-Domain Rating Datasets from Structured Data on Twitter
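
The Goodreads slide lists the fields recovered from each structured tweet: Twitter user, rating, book title, book author, Goodreads URL, and time. A minimal sketch of that extraction step follows. The tweet template ("N of 5 stars to TITLE by AUTHOR URL"), the `BookRating` type, and `parse_goodreads_tweet` are assumptions made for illustration; they are not taken from the slides or from the Twitter-ratings repository.

```python
import re
from typing import NamedTuple, Optional

# Assumed tweet template (hypothetical, not confirmed by the slides):
#   "<rating> of 5 stars to <title> by <author> <url>"
GOODREADS_PATTERN = re.compile(
    r"^(?P<rating>[1-5]) of 5 stars to (?P<title>.+) by (?P<author>.+?) "
    r"(?P<url>https?://\S+)$"
)


class BookRating(NamedTuple):
    """One rating record with the fields named on the Goodreads slide."""
    user: str
    rating: int
    title: str
    author: str
    url: str
    time: str


def parse_goodreads_tweet(user: str, text: str, time: str) -> Optional[BookRating]:
    """Return a BookRating if the tweet follows the assumed template, else None."""
    match = GOODREADS_PATTERN.match(text.strip())
    if match is None:
        return None  # free-form tweet: skip it rather than guess
    return BookRating(
        user=user,
        rating=int(match["rating"]),
        title=match["title"],
        author=match["author"],
        url=match["url"],
        time=time,
    )
```

Tweets that do not fit the template are dropped instead of being parsed loosely, which trades recall for cleaner rating records. The title group is greedy, so a title that itself contains " by " is split at the last occurrence.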
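
Because every mined record carries the Twitter user, ratings from different websites can be joined on that user to form a cross-domain dataset. The sketch below shows one way to find users who appear in more than one domain. The `(user, domain, item)` record layout and the `cross_domain_users` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical record layout: (twitter_user, domain, item_id)
Record = Tuple[str, str, str]


def cross_domain_users(records: Iterable[Record]) -> Dict[str, Set[str]]:
    """Map each user seen in at least two domains to the set of those domains."""
    domains_by_user: Dict[str, Set[str]] = defaultdict(set)
    for user, domain, _item in records:
        domains_by_user[user].add(domain)
    # Keep only users whose activity spans more than one website.
    return {
        user: domains
        for user, domains in domains_by_user.items()
        if len(domains) >= 2
    }


if __name__ == "__main__":
    sample = [
        ("alice", "imdb", "tt0133093"),
        ("alice", "goodreads", "book-1"),
        ("bob", "youtube", "video-1"),
    ]
    print(cross_domain_users(sample))
```

In the sample, only `alice` has records from two websites, so she is the only user kept.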