Prediction io–final 2014-jp-handout
Yuki Furuta, Naoto Yamamoto, Tran Hoan
Facebook Open Academy International
What is PredictionIO?
An open source machine learning server for software developers to create predictive features in their web and mobile apps.
Currently powering thousands of developers and hundreds of applications [1]
[1] http://github.com/PredictionIO/
Architecture
[Figure: PredictionIO architecture. Labels: Web App, Mobile App; Data In: Import Data (Event Server), HBase; Spark: Data Source, Data Preparator, Algorithm 1 … Algorithm N, Model 1 … Model N, Serving; HDFS; Data Out: Query, Prediction Result.]
Key points: horizontal scalability, productivity
Source: http://docs.prediction.io/resources/systems/
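The stages named in the architecture figure can be illustrated with a toy pipeline. This is only a sketch under assumptions: every name below (`PipelineSketch`, `readData`, `prepare`, `trainMean`, `trainCount`, `serve`) is hypothetical and is not PredictionIO's API, the data is hard-coded, and the way Serving combines models (summing scores) is just one possible choice.

```scala
// Hypothetical sketch of the stages in the figure; NOT PredictionIO's real classes.
object PipelineSketch {
  final case class Rating(user: String, item: String, value: Double)
  final case class Query(num: Int)
  final case class ItemScore(item: String, score: Double)

  type Prepared = Map[String, Seq[Double]] // item -> all ratings it received
  type Model = Map[String, Double]         // item -> score

  // Data Source: toy ratings, hard-coded instead of read from an event store.
  def readData(): Seq[Rating] = Seq(
    Rating("u1", "i1", 5.0), Rating("u1", "i2", 1.0),
    Rating("u2", "i1", 4.0), Rating("u2", "i3", 3.0))

  // Data Preparator: group the raw ratings by item.
  def prepare(raw: Seq[Rating]): Prepared =
    raw.groupBy(_.item).map { case (item, rs) => item -> rs.map(_.value) }

  // Algorithm 1: score each item by its mean rating.
  def trainMean(data: Prepared): Model =
    data.map { case (item, vs) => item -> (vs.sum / vs.size) }

  // Algorithm N: score each item by how many ratings it has.
  def trainCount(data: Prepared): Model =
    data.map { case (item, vs) => item -> vs.size.toDouble }

  // Serving: sum the scores of all models per item and return the top `num`.
  def serve(models: Seq[Model], query: Query): Seq[ItemScore] =
    models.flatMap(_.keys).distinct
      .map(item => ItemScore(item, models.map(_.getOrElse(item, 0.0)).sum))
      .sortBy(s => -s.score)
      .take(query.num)

  // Data in -> prepare -> train every algorithm -> serve a query.
  def run(query: Query): Seq[ItemScore] = {
    val prepared = prepare(readData())
    serve(Seq(trainMean(prepared), trainCount(prepared)), query)
  }
}
```

Calling `PipelineSketch.run(PipelineSketch.Query(2))` returns the two best items after both toy models have been combined.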
What can PredictionIO do?
Content-based recommendation
Trend detection
Sentiment analysis
Restaurant recommendation
User similarity
Data analysis
Engine (recommendation, rank, …)
YELPIO-NAVI
MovieLens
YELPIO-NAVI
Recommendation App for Restaurants Using the Yelp! Dataset
Naoto Yamamoto (JP), Tran Hoan (JP), Inhwan Eric Lee (USA)
What is YELPIO-NAVI
Yelp: the Tabelog (食べログ, a Japanese restaurant review site) of America
Information used: restaurants' addresses and stars; users' stars
Recommendation of restaurants using this information
YELPIO-NAVI Demo Setup
[Figure: demo setup. Labels: Yelp (http://www.yelp.com/); batch import data through Ruby SDK; store & retrieve business data; retrieve & store business data through REST API; retrieve prediction results through REST API.]
Engines: (1) Neighbourhood model, (2) Collaborative Filtering
Code: https://github.com/OminiaVincit/predictionio_rails, https://github.com/OminiaVincit/YELPIO_demo2
Demo: http://yelpio.hongo.wide.ad.jp/
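As an illustration of what a neighbourhood model over users' stars might look like, here is a minimal sketch. It is an assumption, not the demo's actual code: the names (`NeighbourhoodSketch`, `cosine`, `predict`) are hypothetical, and cosine similarity with a similarity-weighted mean is just one common neighbourhood technique.

```scala
// Hypothetical user-based neighbourhood sketch; not taken from the YELPIO-NAVI code.
object NeighbourhoodSketch {
  // user -> (restaurant -> stars)
  type Stars = Map[String, Map[String, Double]]

  // Cosine similarity between two users' star vectors (missing entries count as 0).
  def cosine(a: Map[String, Double], b: Map[String, Double]): Double = {
    val dot = a.keySet.intersect(b.keySet).toSeq.map(k => a(k) * b(k)).sum
    val na = math.sqrt(a.values.map(v => v * v).sum)
    val nb = math.sqrt(b.values.map(v => v * v).sum)
    if (na == 0.0 || nb == 0.0) 0.0 else dot / (na * nb)
  }

  // Predict a user's stars for a restaurant as the similarity-weighted mean
  // of the stars given by the other users who rated it.
  def predict(stars: Stars, user: String, item: String): Option[Double] = {
    val mine = stars.getOrElse(user, Map.empty[String, Double])
    val neighbours = for {
      (other, theirs) <- stars.toSeq
      if other != user
      s <- theirs.get(item).toSeq
    } yield (cosine(mine, theirs), s)
    val weight = neighbours.map(_._1).sum
    if (weight == 0.0) None
    else Some(neighbours.map { case (w, s) => w * s }.sum / weight)
  }
}
```

`predict` returns `None` when no other user has rated the restaurant or when no neighbour is similar at all, so the caller can fall back to another engine.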
YELPIO-NAVI Demo
http://yelpio.hongo.wide.ad.jp/
http://zorovn.hongo.wide.ad.jp/
MovieLens
Content-based Movie Recommendation
Yuki Furuta (JP), Nhu-Quynh (USA), Beth Yue Shi (USA), Shaocong Mo (USA)
PredictionIO x MovieLens - Content-Based Movie Recommendation Engine -
A. Collaborative Filtering
[Figure: users A and B]
MovieLens Datasets
• 100,000 ratings (1-5) from 943 users on 1,682 movies
• Simple demographic info for the users (age, gender, occupation, zip)
• Information about the movies (title, release date, genre)
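Reading such a dataset could be sketched as below. This assumes the file layout of the public MovieLens 100K release (pipe-separated user lines and tab-separated rating lines ending in a timestamp), which the slides do not state; the case classes and function names are hypothetical.

```scala
// Hypothetical parsers; the line layouts are an assumption about the MovieLens 100K files.
object MovieLensParseSketch {
  final case class User(id: Int, age: Int, gender: String, occupation: String, zip: String)
  final case class Rating(user: Int, movie: Int, rating: Double)

  // Assumed user line: id|age|gender|occupation|zip
  def parseUser(line: String): Option[User] = line.split('|') match {
    case Array(id, age, gender, occupation, zip) =>
      for { i <- id.toIntOption; a <- age.toIntOption }
        yield User(i, a, gender, occupation, zip)
    case _ => None // malformed line
  }

  // Assumed rating line: user<TAB>movie<TAB>rating<TAB>timestamp (timestamp ignored)
  def parseRating(line: String): Option[Rating] = line.split('\t') match {
    case Array(user, movie, rating, _) =>
      for { u <- user.toIntOption; m <- movie.toIntOption; r <- rating.toDoubleOption }
        yield Rating(u, m, r)
    case _ => None // malformed line
  }
}
```

Returning `Option` keeps malformed lines from aborting an import; a caller can `flatMap` over the file's lines to keep only the rows that parsed.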
B. Content-Based
[Figure: user A (age: 20, male, RUS) and user B (age: 21, male, KZH)]
A 20-year-old man likes:
• Action 60%
• Comedy 10%
• English 10%
• etc.
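A percentage profile like the one above could be derived by counting how often each feature occurs among the movies a user liked. The following is a sketch under that assumption; `ProfileSketch` and `featureShares` are hypothetical names, and the slides do not say how the percentages were actually computed.

```scala
// Hypothetical sketch: share of each feature among the features of liked movies.
object ProfileSketch {
  // Each liked movie is represented only by its set of features (genre, language, ...).
  def featureShares(likedMovies: Seq[Set[String]]): Map[String, Double] = {
    val all = likedMovies.flatMap(_.toSeq) // every feature occurrence
    if (all.isEmpty) Map.empty[String, Double]
    else all.groupBy(identity).map { case (feature, occurrences) =>
      feature -> (occurrences.size.toDouble / all.size)
    }
  }
}
```

The shares sum to 1, so they can be read directly as percentages of the user's taste.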
Dataset
    val DataSourceAttributeNames = AttributeNames(
      user = "pio_user",
      item = "pio_item",
      u2iActions = Set("rate"),
      itypes = "pio_itypes",
      starttime = "pio_starttime",
      endtime = "pio_endtime",
      inactive = "pio_inactive",
      rating = "pio_rating")
[Figure: engine pipeline. Labels: Reading Data; Preparation (Prepare); Algorithms: Feature Based, User Based (Train); Serve; Query.]
MovieLens
- User (ID, Age, Gender, Occupation, Zip)
- Movie (ID, Title, Year, Genre, Actors, …)
Feature Based Algorithm
[Figure: movies Michael likes, with their features]
• Stanley Kubrick, America, Comedy, Black, SF
• Rowan Atkinson, United Kingdom, Comedy, SF, Action
Feature Based Algorithm (continued)
[Figure: movies Michael likes, and a candidate movie that shares their features]
• Stanley Kubrick, USA, Comedy, Black, Scientific, Fantasy
• Rowan Atkinson, United Kingdom, Comedy, SF, Action, Fantasy
Michael: Comedy, Fantasy, Action, USA
• Mark Wahlberg, USA, Comedy, Fantasy, Action: Recommend!
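The idea in the figure, recommending the candidate whose features overlap most with what the user already likes, might be sketched as follows. This is an assumption about the approach rather than the team's implementation; the names and the movie titles (`liked-1`, `candidate-1`, ...) are hypothetical placeholders, since the slides only list features.

```scala
// Hypothetical feature-overlap scorer; titles are placeholders, features come from the slide.
object FeatureScoreSketch {
  final case class Movie(title: String, features: Set[String])

  // Profile: how many liked movies carry each feature.
  def profile(liked: Seq[Movie]): Map[String, Int] =
    liked.flatMap(_.features.toSeq).groupBy(identity)
      .map { case (feature, occurrences) => feature -> occurrences.size }

  // Score of a candidate: sum of the profile weights of its features.
  def score(p: Map[String, Int], m: Movie): Int =
    m.features.toSeq.map(f => p.getOrElse(f, 0)).sum

  // Recommend the candidate with the highest score, if there is any candidate.
  def recommend(liked: Seq[Movie], candidates: Seq[Movie]): Option[Movie] = {
    val p = profile(liked)
    if (candidates.isEmpty) None else Some(candidates.maxBy(score(p, _)))
  }
}
```

With the two liked movies from the slide, a candidate tagged "Mark Wahlberg, USA, Comedy, Fantasy, Action" scores higher than one that shares no features, so it is the one recommended.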
Feature Based Algorithm: Train
Users:
UserID: 1, Age: 24, Gender: M, Occupation: technician, Zip: 85711
UserID: 2, Age: 53, Gender: F, Occupation: other, Zip: 94043
UserID: 3, Age: 23, Gender: M, Occupation: writer, Zip: 32067
UserID: 4, Age: 24, Gender: M, Occupation: technician, Zip: 43537
UserID: 5, Age: 33, Gender: F, Occupation: other, Zip: 15213
Ratings, with the result of applying a threshold (e.g. 2.0):
User: 196 rates Movie: 242 (3.0 / 5): BUY
User: 186 rates Movie: 302 (3.0 / 5): BUY
User: 22 rates Movie: 377 (1.0 / 5): -
User: 244 rates Movie: 51 (2.0 / 5): -
User: 166 rates Movie: 346 (1.0 / 5): -
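The thresholding step can be written in a few lines. In the slide a rating of 2.0 with threshold 2.0 is not marked BUY, so this sketch assumes a strict comparison; `ThresholdSketch` and `toBuys` are hypothetical names.

```scala
// Hypothetical sketch of turning ratings into BUY actions before training.
object ThresholdSketch {
  final case class Rating(user: Int, movie: Int, rating: Double)

  // Keep only (user, movie) pairs rated strictly above the threshold.
  def toBuys(ratings: Seq[Rating], threshold: Double): Seq[(Int, Int)] =
    ratings.filter(_.rating > threshold).map(r => (r.user, r.movie))
}
```

Applied to the five ratings listed on the slide with threshold 2.0, only the pairs (196, 242) and (186, 302) remain.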
Query
e.g.
• Recommend 5 movies for UserID: 2
• Recommend 5 movies which are “Comedy” for UserID: 2
• Recommend 2 movies which are “Action” by Rowan Atkinson for UserID: 2
Predict
1. MovieID: 297 Score: -8.53295620539528
2. MovieID: 251 Score: -13.326537513274323
3. MovieID: 292 Score: -15.276804370241758
4. MovieID: 290 Score: -32.944167483781335
5. MovieID: 314 Score: -37.45527366828404
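Answering such queries amounts to filtering the scored movies by the requested features and returning the top results. A minimal sketch, assuming per-user scores have already been computed; the `Query` shape, the names, and the idea of expressing "Comedy" or "Rowan Atkinson" as required features are assumptions, not the engine's actual query format.

```scala
// Hypothetical query handling over precomputed scores; not the engine's real query API.
object QuerySketch {
  final case class Scored(movieId: Int, score: Double, features: Set[String])
  final case class Query(userId: Int, num: Int, required: Set[String] = Set.empty)

  // Keep movies that carry every required feature, best score first, at most `num` results.
  def answer(scoresByUser: Map[Int, Seq[Scored]], q: Query): Seq[Scored] =
    scoresByUser.getOrElse(q.userId, Seq.empty[Scored])
      .filter(m => q.required.subsetOf(m.features))
      .sortBy(m => -m.score)
      .take(q.num)
}
```

Under these assumptions, the third example query would be written as `Query(2, 2, Set("Action", "Rowan Atkinson"))`. Scores may be negative, as in the output above; only their order matters.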
…to be continued
Scale for Big Data
Multi-engines & Multi-algorithms
Predict with more features
…
Evaluation
Thank you for listening
Japanese team
Yuki Furuta, Naoto Yamamoto, Tran Hoan