Prediction io–final 2014-jp-handout

Yuki Furuta, Naoto Yamamoto, Tran Hoan

Facebook Open Academy International


Page 1: Prediction io–final 2014-jp-handout

Yuki Furuta Naoto Yamamoto Tran Hoan

Facebook Open Academy International

Page 2: Prediction io–final 2014-jp-handout

What is PredictionIO?

An open source machine learning server

For software developers to create predictive features in their web and mobile apps.

Currently powering thousands of developers and hundreds of applications [1]

[1] http://github.com/PredictionIO/


Page 3: Prediction io–final 2014-jp-handout

Architecture

Horizontal scalability

Productivity

[Diagram: Data In (Web App, Mobile App) -> Import Data (EventServer) -> Data Source -> Data Preparator -> Algorithm 1 ... Algorithm N -> Model 1 ... Model N -> Serving; Data Out (Query -> Prediction Result). Infrastructure labels: HBase, HDFS, Spark.]

http://docs.prediction.io/resources/systems/

Page 4: Prediction io–final 2014-jp-handout

What can PredictionIO do?

Content-based recommendation

Trend detection

Sentiment analysis

Restaurant recommendation

User similarity

Data analysis

Engine (recommendation, rank, …)

YELPIO-NAVI

MovieLens

Page 5: Prediction io–final 2014-jp-handout

YELPIO-NAVI

Recommendation App for Restaurants Using Yelp! Dataset

Naoto Yamamoto (JP), Tran Hoan (JP), Inhwan Eric Lee (USA)

Page 6: Prediction io–final 2014-jp-handout

What is YELPIO-NAVI?

Yelp: the Tabelog (食べログ, a Japanese restaurant review site) of America

Information on:

• restaurants' addresses and star ratings

• users' star ratings

Recommendation of restaurants using this information

Page 7: Prediction io–final 2014-jp-handout

YELPIO-NAVI Demo Setup

Batch import data through Ruby SDK

Store & retrieve business data

Retrieve & store business data through REST API

Retrieve Prediction Results through REST API

https://github.com/OminiaVincit/predictionio_rails

http://yelpio.hongo.wide.ad.jp/

https://github.com/OminiaVincit/YELPIO_demo2

(1) Neighbourhood model

(2) Collaborative Filtering

http://www.yelp.com/

Page 8: Prediction io–final 2014-jp-handout

YELPIO-NAVI Demo

http://yelpio.hongo.wide.ad.jp/

http://zorovn.hongo.wide.ad.jp/

Page 9: Prediction io–final 2014-jp-handout

MovieLens

Content-based Movie Recommendation

Yuki Furuta, Nhu-Quynh Beth Yue Shi, Shaocong Mo

(JP) (USA) (USA) (USA)

Page 10: Prediction io–final 2014-jp-handout

PredictionIO x MovieLens

- Content-Based Movie Recommendation Engine -

A. Collaborative Filtering

[Figure: users A and B]
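As a rough illustration of the collaborative filtering idea (suggest to user A what a similar user B rated highly), here is a minimal Python sketch. The toy ratings, the choice of cosine similarity, and the function names are all illustrative assumptions and not the team's actual engine.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "A": {"m1": 5.0, "m2": 4.0, "m3": 1.0},
    "B": {"m1": 5.0, "m2": 5.0, "m4": 4.0},
    "C": {"m1": 1.0, "m3": 5.0, "m5": 4.0},
}

def cosine_similarity(u, v):
    """Cosine similarity over the movies both users rated (0.0 if none)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[m] * v[m] for m in common)
    norm_u = sqrt(sum(u[m] ** 2 for m in common))
    norm_v = sqrt(sum(v[m] ** 2 for m in common))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=3):
    """Rank movies the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in ratings[user]:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("A", ratings))
```

In this toy data user B rates like user A, so B's unseen movie ranks ahead of the one that only the dissimilar user C rated.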

Page 11: Prediction io–final 2014-jp-handout

PredictionIO x MovieLens

- Content-Based Movie Recommendation Engine -

MovieLens Datasets

• 100,000 ratings (1-5) from 943 users on 1682 movies

• Simple demographic info for the users (age, gender, occupation, zip)

• Information about the movies (title, release date, genre)

B. Content-Based

A (age: 20, male, RUS)

B (age: 21, male, KZH)

20-year-old man likes:

• Action 60%

• Comedy 10%

• English 10%

• etc.
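The "20-year-old man likes" breakdown can be read as a per-demographic preference profile. A minimal Python sketch of how such shares might be computed follows; the records, field layout, and function name are hypothetical and are not taken from the MovieLens files or the team's code.

```python
from collections import Counter

# Hypothetical records: (age, gender, genre of a movie the user liked).
liked = [
    (20, "M", "Action"), (20, "M", "Action"), (20, "M", "Action"),
    (20, "M", "Comedy"), (21, "M", "Action"), (20, "F", "Drama"),
]

def genre_profile(liked, age, gender):
    """Share of each genre among movies liked by one demographic group."""
    counts = Counter(g for a, s, g in liked if a == age and s == gender)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {genre: n / total for genre, n in counts.items()}

profile = genre_profile(liked, 20, "M")
print(profile)
```

A new user with no ratings could then be given the profile of their demographic group as a starting point, which is one common motivation for content-based approaches.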

Page 12: Prediction io–final 2014-jp-handout

PredictionIO x MovieLens

- Content-Based Movie Recommendation Engine -

Dataset

val DataSourceAttributeNames = AttributeNames(
  user = "pio_user",
  item = "pio_item",
  u2iActions = Set("rate"),
  itypes = "pio_itypes",
  starttime = "pio_starttime",
  endtime = "pio_endtime",
  inactive = "pio_inactive",
  rating = "pio_rating")

MovieLens

- User (ID, Age, Gender, Occupation, Zip)

- Movie (ID, Title, Year, Genre, Actors, …)

[Diagram: engine pipeline with the labels Reading Data, Preparation (Prepare), Algorithms (Feature Based, User Based; Train), Query, Serve]

Page 13: Prediction io–final 2014-jp-handout

PredictionIO x MovieLens

- Content-Based Movie Recommendation Engine -

Feature Based Algorithm

[Diagram: user Michael and movie feature labels]

Stanley Kubrick: America, Comedy, Black, SF

Rowan Atkinson: United Kingdom, Comedy, SF, Action

Page 14: Prediction io–final 2014-jp-handout

PredictionIO x MovieLens

- Content-Based Movie Recommendation Engine -

Feature Based Algorithm

[Diagram: user Michael and movie feature labels]

Stanley Kubrick: USA, Comedy, Black, Scientific, Fantasy

Rowan Atkinson: United Kingdom, Comedy, SF, Action, Fantasy

Comedy, Fantasy, Action, USA

Mark Wahlberg: USA, Comedy, Fantasy, Action

Recommend!
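One way to picture the feature-based matching on this slide is as set overlap between the features a user tends to like and the features of each candidate movie. The sketch below uses Jaccard similarity for that overlap; the similarity measure, the movie names, and the feature sets are assumptions chosen for illustration, and the slide does not say which scoring function the engine uses.

```python
# Hypothetical feature sets, loosely following the labels on the slide.
user_features = {"Comedy", "Fantasy", "Action", "USA"}
movies = {
    "movie_1": {"United Kingdom", "Comedy", "SF"},
    "movie_2": {"USA", "Comedy", "Fantasy", "Action"},
    "movie_3": {"USA", "Drama"},
}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_by_features(user_features, movies):
    """Movie names ordered from best to worst feature overlap."""
    scored = [(jaccard(user_features, feats), name)
              for name, feats in movies.items()]
    # Sort by descending score, breaking ties by name for determinism.
    return [name for score, name in sorted(scored, key=lambda p: (-p[0], p[1]))]

print(rank_by_features(user_features, movies))
```

Here the movie whose features coincide with the user's preferred features ranks first, which mirrors the "Recommend!" outcome in the diagram.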

Page 15: Prediction io–final 2014-jp-handout

PredictionIO x MovieLens

- Content-Based Movie Recommendation Engine -

Feature Based Algorithm

Users:

UserID  Age  Gender  Occupation  Zip
1       24   M       technician  85711
2       53   F       other       94043
3       23   M       writer      32067
4       24   M       technician  43537
5       33   F       other       15213

Ratings, with the action derived from a threshold (e.g. 2.0):

User  Movie  Rating (out of 5)  Action
196   242    3.0                BUY
186   302    3.0                BUY
22    377    1.0                -
244   51     2.0                -
166   346    1.0                -

Train

Query, e.g.

• Recommend 5 movies for UserID: 2

• Recommend 5 movies which are "Comedy" for UserID: 2

• Recommend 2 movies which are "Action" by Rowan Atkinson for UserID: 2

Predict

1. MovieID: 297 Score: -8.53295620539528

2. MovieID: 251 Score: -13.326537513274323

3. MovieID: 292 Score: -15.276804370241758

4. MovieID: 290 Score: -32.944167483781335

5. MovieID: 314 Score: -37.45527366828404
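The two mechanical steps on this slide, turning ratings into BUY actions and returning the highest-scoring movies, can be sketched in Python as below. The ratings and scores are copied from the slide; the strict "rating > threshold" rule is an assumption inferred from the BUY/- row (the 2.0 rating is not marked BUY), and the function names are hypothetical.

```python
# (user, movie, rating) triples as listed on the slide.
ratings = [
    (196, 242, 3.0), (186, 302, 3.0), (22, 377, 1.0),
    (244, 51, 2.0), (166, 346, 1.0),
]

def to_actions(ratings, threshold=2.0):
    """Mark a rating as BUY only when it is strictly above the threshold."""
    return ["BUY" if rating > threshold else "-" for _, _, rating in ratings]

# MovieID -> predicted score, as listed on the slide for UserID 2.
scores = {
    297: -8.53295620539528,
    251: -13.326537513274323,
    292: -15.276804370241758,
    290: -32.944167483781335,
    314: -37.45527366828404,
}

def top_n(scores, n):
    """Movie IDs with the n highest scores, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

print(to_actions(ratings))
print(top_n(scores, 2))
```

The scores are negative, so "highest" means closest to zero; only their relative order matters for the ranking.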

Page 16: Prediction io–final 2014-jp-handout

…to be continued

Scale for Big Data

Multi-engines & Multi-algorithms

Predict with more features


Evaluation

Page 17: Prediction io–final 2014-jp-handout

Thank you for listening

Japanese team

Yuki Furuta Naoto Yamamoto Tran Hoan