Post on 10-May-2015
Recommendation Engine
Outline
Introduction
Objectives
Scope
Problems with existing system
Purpose of new system
Proposed architecture
Technologies to be used
Modules of system
Integration of technologies
Implementation issues to be solved
Application
Future enhancement
Objectives
Information filtering system
The recommendation engine recommends in three ways: user based, item based, and slope based
Runs in a cloud environment
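The slope based approach can be illustrated with a small sketch. It assumes "slope based" refers to the weighted Slope One scheme, which the slides do not spell out; the function name `slope_one_predict` and the sample ratings are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating for `target` with weighted Slope One.

    `ratings` maps user -> {item: rating}. Returns None when no
    co-rated item is available. Illustrative sketch only.
    """
    diff_sum = defaultdict(float)  # sum of (target rating - item rating)
    count = defaultdict(int)       # number of users who rated both items
    for other_ratings in ratings.values():
        if target not in other_ratings:
            continue
        for item, rating in other_ratings.items():
            if item != target:
                diff_sum[item] += other_ratings[target] - rating
                count[item] += 1

    numerator = 0.0
    denominator = 0
    for item, rating in ratings[user].items():
        n = count.get(item, 0)
        if n > 0:
            average_deviation = diff_sum[item] / n
            # Each co-rated item votes with weight n.
            numerator += (rating + average_deviation) * n
            denominator += n
    return numerator / denominator if denominator else None

# Hypothetical sample data.
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 2.0},
    "bob": {"A": 3.0, "B": 4.0},
    "carol": {"B": 2.0, "C": 5.0},
}
prediction = slope_one_predict(ratings, "bob", "C")
print(round(prediction, 2))
```

For this data the prediction for bob on item C comes out at roughly 3.33: item A contributes a deviation of -3 from one user, item B a deviation of +1 from two users.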
Introduction
A recommendation engine gives suggestions for movies, songs, videos, websites, books, images, and also social elements.
Applicable to e-business.
Useful for both customers and online retailers.
Recommendation engines are used at Amazon, YouTube, Facebook, and Twitter.
Scope
Our system will provide a recommendation service only.
Recommendations will be generated from the user's historical activity, such as purchase patterns, ratings, and likes.
Recommendations will be stored in a database or a file, or returned directly to the retailer's web application.
Problems with existing system
Takes more time to generate recommendations
No real-time recommendations for large data sets
Purpose of new system
Less time for generating recommendations
Applicable to big data
Recommendations by several algorithms: user based, item based, slope based, association rule mining
Evaluation of recommendations
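Association rule mining, listed among the algorithms above, can be sketched for item pairs as follows. This is a minimal illustration, not the system's implementation: the function name `pair_rules`, the thresholds, and the sample baskets are all assumptions.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.5, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules over item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))            # de-duplicate, fix pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), together in pair_counts.items():
        support = together / n                 # share of baskets with both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = together / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules

# Hypothetical purchase histories.
baskets = [
    ["book", "pen"],
    ["book", "pen", "lamp"],
    ["book", "lamp"],
    ["pen"],
]
rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as lamp -> book then reads "customers who bought a lamp also tended to buy a book", which can be turned directly into a recommendation.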
Recommendation types: user-based recommendation
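A user-based recommender can be sketched in a few lines. The version below assumes cosine similarity between users and scores unseen items by a similarity-weighted sum of neighbour ratings; the slides do not say which similarity measure the system uses, and the names `cosine`, `recommend_user_based`, and the sample ratings are hypothetical.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend_user_based(ratings, user, k=2, top_n=3):
    """Recommend items liked by the k users most similar to `user`."""
    neighbours = sorted(
        ((cosine(ratings[user], other_ratings), other)
         for other, other_ratings in ratings.items() if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for similarity, other in neighbours:
        if similarity <= 0:
            continue
        for item, rating in ratings[other].items():
            if item not in ratings[user]:      # only suggest unseen items
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical sample data.
ratings = {
    "u1": {"m1": 5.0, "m2": 4.0},
    "u2": {"m1": 5.0, "m2": 5.0, "m3": 4.0},
    "u3": {"m1": 1.0, "m4": 5.0},
}
print(recommend_user_based(ratings, "u1"))
```

Here u2 rates the shared movies much like u1, so u2's extra movie m3 ranks ahead of m4, which comes from the less similar user u3.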
Recommendation types: item-based recommendation
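An item-based recommender compares items rather than users. The sketch below assumes cosine similarity between item rating vectors and scores each unseen item by its similarity to the items the user has already rated; the function names and the sample data are hypothetical, not taken from the system.

```python
from collections import defaultdict
from math import sqrt

def item_vectors(ratings):
    """Invert user -> {item: rating} into item -> {user: rating}."""
    vectors = defaultdict(dict)
    for user, items in ratings.items():
        for item, rating in items.items():
            vectors[item][user] = rating
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[k] * b[k] for k in common)
    norm_a = sqrt(sum(x * x for x in a.values()))
    norm_b = sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b)

def recommend_item_based(ratings, user, top_n=3):
    """Rank unseen items by similarity to the items `user` has rated."""
    vectors = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate, vector in vectors.items():
        if candidate in seen:
            continue
        score = sum(cosine(vector, vectors[item]) * rating
                    for item, rating in seen.items())
        if score > 0:
            scores[candidate] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical sample data.
ratings = {
    "u1": {"book": 5.0, "pen": 4.0},
    "u2": {"book": 5.0, "pen": 5.0, "lamp": 4.0},
    "u3": {"book": 1.0, "mug": 5.0},
}
print(recommend_item_based(ratings, "u1"))
```

Item similarities change more slowly than user similarities, so in practice they can be precomputed offline; this sketch recomputes them on every call for brevity.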
Proposed System Architecture
Technologies to be used
Hadoop
Mahout
GraphLab
Google Prediction API
Google Cloud Storage
Google App Engine
Modules of system
User Module
Admin Module
Recommendation Module
File management Module
Search Module
Integration of technologies
Mahout-based recommendation
Graph-based recommendation
Google Prediction-based recommendation
Technology: Hadoop
Hadoop is a top-level Apache project built and used by a global community of contributors.
The Hadoop project develops open-source software for reliable, scalable, distributed computing.
It enables applications to work with thousands of nodes and petabytes of data.
Hadoop also supports the MapReduce programming model.
It provides the HDFS file system, which stores data on the compute nodes.
Hadoop
GraphLab
GraphLab is a new parallel framework for machine learning algorithms.
Designing and implementing efficient and correct parallel machine learning (ML) algorithms can be very challenging.
Designed specifically for ML needs.
Automatic data synchronization.
The update function plays a role like the map phase.
The sync operation plays a role like the reduce phase.
GraphLab model
Data graph
Shared data table
Scheduling
Update functions and scopes
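The parts of the model listed above can be imitated in plain Python to show how they fit together. This is a single-threaded toy, not GraphLab's API: the data graph is a dict, the scheduler is a queue, `pagerank_update` stands in for an update function working on one vertex's scope, and `sync_total` stands in for a sync operation. PageRank is used only as a convenient example, and all names are hypothetical.

```python
from collections import deque

def pagerank_update(vertex, graph, rank, damping=0.85):
    """Update function: recompute one vertex from its in-neighbours (its scope)."""
    incoming = sum(rank[src] / len(graph[src])
                   for src in graph if vertex in graph[src])
    return (1 - damping) + damping * incoming

def run(graph, tolerance=1e-4, max_updates=10000):
    """Apply the update function to scheduled vertices until none is pending."""
    rank = {v: 1.0 for v in graph}   # vertex data stored on the data graph
    schedule = deque(graph)          # scheduler: vertices waiting for an update
    queued = set(graph)
    updates = 0
    while schedule and updates < max_updates:
        vertex = schedule.popleft()
        queued.discard(vertex)
        new_rank = pagerank_update(vertex, graph, rank)
        changed = abs(new_rank - rank[vertex]) > tolerance
        rank[vertex] = new_rank
        updates += 1
        if changed:
            # A changed vertex reschedules the vertices it points to.
            for neighbour in graph[vertex]:
                if neighbour not in queued:
                    schedule.append(neighbour)
                    queued.add(neighbour)
    return rank

def sync_total(rank):
    """Sync operation: fold over all vertex data to get a global value."""
    return sum(rank.values())

# Hypothetical three-vertex graph: vertex -> list of out-neighbours.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
rank = run(graph)
print({v: round(r, 3) for v, r in rank.items()}, round(sync_total(rank), 3))
```

Only vertices whose inputs changed are rescheduled, which is the main difference from a MapReduce job that reprocesses every record on every pass.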
MapReduce: Map Phase
[Figure: CPU 1 to CPU 4 each run an embarrassingly parallel, independent computation; no communication is needed.]
MapReduce: Reduce Phase
[Figure: CPU 1 and CPU 2 combine the map outputs by fold/aggregation.]
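The two phases can be sketched on a single machine to make the data flow concrete. The example below computes an average rating per item; it does not use Hadoop, and the function names, the record layout, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce

def map_phase(record):
    """Map: turn one (user, item, rating) record into a key/value pair."""
    _user, item, rating = record
    return item, (rating, 1)

def reduce_phase(left, right):
    """Reduce: fold two partial (sum, count) pairs into one."""
    return left[0] + right[0], left[1] + right[1]

def average_ratings(records):
    # Map phase: every record is handled independently, so the work
    # could be spread over many CPUs with no communication.
    mapped = map(map_phase, records)

    # Shuffle: group the mapped values by key.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)

    # Reduce phase: aggregate each group down to a single value.
    totals = {key: reduce(reduce_phase, values)
              for key, values in groups.items()}
    return {key: total / count for key, (total, count) in totals.items()}

# Hypothetical rating records.
records = [("u1", "m1", 4.0), ("u2", "m1", 2.0), ("u1", "m2", 5.0)]
averages = average_ratings(records)
print(averages)
```

Because `reduce_phase` is associative, partial sums computed on different machines can be combined in any order, which is what makes the fold step parallelizable.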
GraphLab in recommendation
GraphLab provides a good basis for a recommendation engine.
It first loads a simple dataset file.
In GraphLab we can also implement various algorithms, such as k-means clustering, fuzzy logic, and PageRank.
It first translates the dataset into matrix form.
Then, according to the chosen algorithm, it generates the recommendation output.
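The step of translating a dataset into matrix form can be illustrated without any framework. The sketch below assumes the dataset is a list of (user, item, rating) triples and that missing ratings are stored as 0.0; both are assumptions, since the slides do not describe the file layout, and `to_matrix` is a hypothetical helper, not a GraphLab function.

```python
def to_matrix(triples):
    """Build a dense user x item rating matrix from (user, item, rating) triples."""
    users = sorted({user for user, _, _ in triples})
    items = sorted({item for _, item, _ in triples})
    row = {user: index for index, user in enumerate(users)}
    col = {item: index for index, item in enumerate(items)}

    matrix = [[0.0] * len(items) for _ in users]  # 0.0 marks "not rated"
    for user, item, rating in triples:
        matrix[row[user]][col[item]] = rating
    return users, items, matrix

# Hypothetical dataset.
triples = [
    ("u1", "m1", 5.0),
    ("u1", "m2", 3.0),
    ("u2", "m2", 4.0),
    ("u3", "m3", 2.0),
]
users, items, matrix = to_matrix(triples)
for user, ratings_row in zip(users, matrix):
    print(user, ratings_row)
```

Real rating data is mostly empty, so a production system would normally use a sparse representation instead of this dense list of lists.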
Google Prediction Service
Google cloud service used for building smart applications.
Provides machine learning algorithms.
Related to artificial intelligence.
Google Prediction API: a set of methods for data analysis; libraries support multiple languages.
Google App Engine: brings the application to the cloud environment and acts as the application server.
Google Cloud Storage: stores data on Google Cloud.
Technology: Mahout
• Apache Mahout is an open-source project of the Apache Software Foundation (ASF).
• The primary goal of Mahout is to create scalable machine-learning algorithms.
• Mahout includes several MapReduce-enabled clustering implementations, including k-Means, fuzzy k-Means, Canopy, Dirichlet, and Mean-Shift.
• Mahout generally takes datasets in a fixed format as input.
• Amazon EC2 works with Hadoop and Mahout.
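Of the clustering algorithms named above, k-Means is the simplest to sketch. The version below is a plain single-machine illustration, not Mahout's MapReduce implementation; it seeds the centroids with the first k points so the result is deterministic, and the function name and sample points are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Cluster `points` (tuples of floats) into k groups.

    Returns (centroids, assignment), where assignment[i] is the
    cluster index of points[i].
    """
    centroids = [list(point) for point in points[:k]]  # deterministic seeding
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(point, centroids[c])))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    [sum(dim) / len(members) for dim in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:              # converged
            break
        centroids = new_centroids
    return centroids, assignment

# Hypothetical two-dimensional data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, k=2)
print(assignment)
```

In a MapReduce setting the assignment step corresponds to the map phase and the centroid update to the reduce phase, with one job per iteration.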
Implementation issues to be solved
Lack of knowledge about Hadoop, Mahout, and Hive
Memory issues
Operating system support
Load balancing
Configuration
Data normalization
Developing the clustering algorithm
Configuring Mahout with Hadoop
Application of recommendation
Yahoo!
Facebook
Twitter
Baidu
eBay
LinkedIn
New York Times
Rackspace
eHarmony
Powerset
Future enhancement
Integration with web application technologies such as JSP and Servlets
Integration with data stores such as Hive, HBase, MongoDB, and CouchDB
Cloud-based recommendation service
Integration of Mahout, GraphLab, and Google Prediction based recommendation services
Mobile application integration
Thank You