Optimizing Online Yield via Predictive Modeling
of Individual Site Visitors
Magnify360 Liaisons:
Olivier Chaine, Jim Healy, Nate Pool,
Gilles ?????
David Lapayowker
Marissa Quitt
Elaine Shaver (PM)
Devin Smith
HMC Advisor:
Zachary Dodds
Magnify360
Designs multiple websites for clients with each site customized to meet the needs of different types of users.
Analyzes clickstream data from site visitors in order to provide the website that will best suit each one.
The result is that a larger share of users converts than with a single page.
old Facebook new Facebook
System Overview
Diagram legend: User Actions, Dataflow, Our system
User actions: navigates to a site; tailored interactions; "Conversion"
Online classifier: classify user (e.g. Musician), choose page, serve page
Data collected: clickstream data, results
• user data
• pages served
• conversion data
Offline analysis: clustering into user groups
Example user groups: Musician, Pachyphile, Bioengineer, Pasadena resident, Insomniac
Problem Statement
[Diagram: same system diagram as on the System Overview slide]
Detailed problem statement here
Clickstream Data
example columns…
Database
80 tables, 110,000,000 rows, 13 GB
ethics ~ anonymous ~ no purchased data!
User profiles
A profile is a binary attribute that captures a specific combination of data values.
Currently 42 of them, hand-specified
insomniac something something
Tradeoffs:
+ captures experienced intuition about what is important
+ takes advantage of Magnify360's site-design expertise
- binary attributes
- may miss patterns not captured by the user profiles
from Mag360's site
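As a rough illustration of a profile as a binary attribute, the sketch below treats each profile as a yes/no test on a visitor record and encodes a visitor as a 0/1 vector. The field names (`visit_hour`, `previous_visits`, `referrer_type`) and the three profile definitions are hypothetical; the actual 42 hand-specified profiles are not given in these slides.

```python
from typing import Callable, Dict, List

Visitor = Dict[str, object]          # hypothetical raw clickstream record
Profile = Callable[[Visitor], bool]  # a profile is a yes/no test on that record

# Hypothetical profiles for illustration only; the real ones are hand-specified.
PROFILES: Dict[str, Profile] = {
    "insomniac": lambda v: 0 <= v["visit_hour"] < 5,
    "returning": lambda v: v["previous_visits"] > 0,
    "search_referral": lambda v: v["referrer_type"] == "search",
}


def profile_vector(visitor: Visitor) -> List[int]:
    """Encode a visitor as a 0/1 vector, one entry per profile."""
    return [int(test(visitor)) for test in PROFILES.values()]


visitor = {"visit_hour": 3, "previous_visits": 0, "referrer_type": "search"}
vector = profile_vector(visitor)
print(vector)  # [1, 0, 1]
```

A vector like this is the kind of input the clustering and SVM approaches later in the deck would consume.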
Conversion data
The site yield, or conversion, is client-specified
Amount of transaction(s)
3% conversion
Time spent on (a part of) the site
Contact information
presence and/or time of an email address
table
Goal: to determine those clusters of visitors who will be best served (convert) via a particular version of a client site
Offline analysis ~ user clustering
Visitors ~ vectors of profile attributes
hand-tuned clusters
decision-tree clustering
fuzzy k-means clustering
support vector machines
one big cluster ~ "best page"
growing neural gas
hierarchical clustering
Support vector machine example
Can we get one of the real data pages?
From clusters to sites
Training data from each cluster determines the best site:
Page: A, Yield: 7
Page: A, Yield: 1
Page: A, Yield: 1
Page: B, Yield: 3
Page: B, Yield: 8
Page: B, Yield: 7
page A score ~ $\frac{7 + 1 + 1 \text{ (yield)}}{3 \text{ (visits)}} = 3.0$
page B score ~ $\frac{7 + 8 + 3 \text{ (yield)}}{3 \text{ (visits)}} = 6.0$
This cluster of six people responds better to site B,
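The per-page averages on this slide can be reproduced with a few lines of code. This is a minimal sketch using the six example visits from the slide; the function name `mean_yield_by_page` is made up for illustration.

```python
from collections import defaultdict

# (page, yield) observations for one cluster, taken from the slide
visits = [("A", 7), ("A", 1), ("A", 1), ("B", 3), ("B", 8), ("B", 7)]


def mean_yield_by_page(observations):
    """Average the observed yield separately for each page."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for page, observed_yield in observations:
        totals[page] += observed_yield
        counts[page] += 1
    return {page: totals[page] / counts[page] for page in totals}


scores = mean_yield_by_page(visits)
best_page = max(scores, key=scores.get)
print(scores, best_page)  # {'A': 3.0, 'B': 6.0} B
```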
Time-based site choice
Magnify360 wants to adapt quickly to new preferences:
Time-weighted average yields:
Page: A, Yield: 7, t: 0
Page: A, Yield: 1, t: 3
Page: A, Yield: 1, t: 4
Page: B, Yield: 3, t: 1
Page: B, Yield: 8, t: 5
Page: B, Yield: 7, t: 4
t ~ age of data
page A score ~ $\frac{2^{0} \cdot 7 + 2^{-3} \cdot 1 + 2^{-4} \cdot 1}{2^{0} + 2^{-3} + 2^{-4}} \approx 6.05$
page B score ~ $\frac{2^{-4} \cdot 7 + 2^{-5} \cdot 8 + 2^{-1} \cdot 3}{2^{-4} + 2^{-5} + 2^{-1}} \approx 3.68$
but site A has had better recent performance.
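The time-weighted scores on this slide follow from weighting each yield by 2 to the power of minus its age. The sketch below reproduces the two example scores; the `half_life` parameter is an added assumption (the slide's numbers correspond to a half-life of one time unit).

```python
def time_weighted_scores(observations, half_life=1.0):
    """Weight each yield by 2 ** (-t / half_life), where t is the age of the data."""
    weighted_sum, weight_total = {}, {}
    for page, observed_yield, age in observations:
        weight = 2.0 ** (-age / half_life)
        weighted_sum[page] = weighted_sum.get(page, 0.0) + weight * observed_yield
        weight_total[page] = weight_total.get(page, 0.0) + weight
    return {page: weighted_sum[page] / weight_total[page] for page in weighted_sum}


# (page, yield, t) triples from the slide
observations = [("A", 7, 0), ("A", 1, 3), ("A", 1, 4),
                ("B", 3, 1), ("B", 8, 5), ("B", 7, 4)]
weighted = time_weighted_scores(observations)
recent_best = max(weighted, key=weighted.get)
print({page: round(score, 2) for page, score in weighted.items()}, recent_best)
# {'A': 6.05, 'B': 3.68} A
```

A shorter half-life makes the choice react faster to recent visits at the cost of noisier scores.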
procedure
Online classification
Possible results…
all on one graph
Results ~ Packet 8
comments
what about hand-tuned system results?
talk about SVM parameters here?
A closer look…
comments
Sensitivity to scoring parameters?
comments
David's charts
Software structure
comments
Diagram
What's done and not done…
Perspective
Concluding comments
Questions?
Clickstream Data
The Good: We have DATA!
The Bad: Too much?
The Ugly: What is this data!?
~ 80 tables
~ 13 GB
One of our tables…
ID, anyone?
Fun Statistics
Data: To do
Understand the purpose of each table / column
Understand relationships between tables
Create a single table (or file) of relevant information in order to test and evaluate our clustering algorithms.
(table demodularization, against all design principles)
Clustering Algorithms
k-Means: Choose centroids at random and assign each point to its nearest centroid, so that distances within clusters are minimized. Recalculate the centroids and repeat until a steady state is reached.
Fuzzy k-Means: Similar, but every data point belongs to every cluster to some degree, not just in or out.
Hierarchical Clustering: Uses a bottom-up approach to bring together points and clusters that are close together.
Bottom line: These clustering algorithms are simple and effective techniques for categorizing data, but they cannot exist in a vacuum; we are investigating other techniques that may be used in parallel or instead.
FuzME's best 10-cluster results ~ synthetic data
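The k-means loop and the fuzzy notion of partial membership described on this slide can be sketched in plain Python as follows. This is an illustrative sketch, not the FuzME tool mentioned above; the toy points, the `seed` argument, and the fuzzifier `m` are assumptions chosen for the example.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # random initial centroids
    assignment = None
    for _ in range(max_iter):
        # assign every point to its nearest centroid
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:       # steady state reached
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:                        # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return centroids, assignment


def fuzzy_memberships(point, centroids, m=2.0):
    """Degree of membership of one point in every cluster (fuzzifier m > 1)."""
    distances = [squared_distance(point, c) ** 0.5 for c in centroids]
    on_centroid = [d == 0.0 for d in distances]
    if any(on_centroid):                       # point coincides with a centroid
        return [1.0 / sum(on_centroid) if hit else 0.0 for hit in on_centroid]
    inverse = [d ** (-2.0 / (m - 1.0)) for d in distances]
    return [v / sum(inverse) for v in inverse]


points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
centroids, assignment = k_means(points, k=2)
print(assignment)
print(fuzzy_memberships((5, 5), centroids))
```

With hard k-means each point gets exactly one label; the fuzzy memberships instead sum to 1 across clusters, so a point halfway between two centroids is shared between them.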
Growing Neural Gas
A clustering algorithm masquerading as a neural network
Given a data distribution, dynamically determines nodes or “centroids” to represent the data
“Dynamic” because it adds or deletes nodes as necessary, as well as adapting nodes toward changes in the data.
User Profiles
Representative Nodes
How it works…
Given some input x:
Find the closest node, s, and the next closest, t.
Update the error of s by εw|s – x|
Shift s and its neighbors toward x, and increment the age of all those edges.
If s and t are adjacent, set the age of that edge to 0. Otherwise, create that edge.
Remove edges that are too old, and decrease the error of all nodes by a small amount.
Add a node every λ generations, putting it between the node with the largest error and its largest-error neighbor.
Repeat!
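The steps on this slide can be sketched as a compact growing-neural-gas loop. The sketch follows the commonly published formulation of the algorithm, which may differ from the team's implementation: it accumulates the squared distance as the error of s, treats β as a multiplicative decay, and omits deletion of isolated nodes to stay short. All default parameter values and the `max_nodes` cap are assumptions.

```python
import random


def gng(data, steps=2000, lam=100, max_age=50, eps_w=0.05, eps_n=0.006,
        alpha=0.5, beta=0.995, max_nodes=20, seed=0):
    rng = random.Random(seed)
    nodes = [list(rng.choice(data)), list(rng.choice(data))]  # start with two nodes
    error = [0.0, 0.0]
    edges = {}                                   # frozenset({i, j}) -> age

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def neighbors(i):
        return [j for e in edges if i in e for j in e if j != i]

    for step in range(1, steps + 1):
        x = rng.choice(data)
        # 1. closest node s and next closest node t
        order = sorted(range(len(nodes)), key=lambda i: dist2(nodes[i], x))
        s, t = order[0], order[1]
        # 2. accumulate error at s (squared distance: an assumption, see lead-in)
        error[s] += dist2(nodes[s], x)
        # 3. shift s and its neighbors toward x; age the edges leaving s
        nodes[s] = [w + eps_w * (xi - w) for w, xi in zip(nodes[s], x)]
        for n in neighbors(s):
            nodes[n] = [w + eps_n * (xi - w) for w, xi in zip(nodes[n], x)]
            edges[frozenset((s, n))] += 1
        # 4. create the edge between s and t, or reset its age to 0
        edges[frozenset((s, t))] = 0
        # 5. remove edges that are too old
        for old in [e for e, age in edges.items() if age > max_age]:
            del edges[old]
        # 6. every lam steps, insert a node between the largest-error node
        #    and its largest-error neighbor
        if step % lam == 0 and len(nodes) < max_nodes:
            q = max(range(len(nodes)), key=lambda i: error[i])
            nbrs = neighbors(q)
            if nbrs:
                f = max(nbrs, key=lambda i: error[i])
                nodes.append([(a + b) / 2 for a, b in zip(nodes[q], nodes[f])])
                r = len(nodes) - 1
                del edges[frozenset((q, f))]
                edges[frozenset((q, r))] = 0
                edges[frozenset((f, r))] = 0
                error[q] *= alpha
                error[f] *= alpha
                error.append(error[q])
        # 7. decay the error of all nodes
        error = [e * beta for e in error]
    return nodes, edges


data_rng = random.Random(1)
data = ([(data_rng.uniform(0, 1), data_rng.uniform(0, 1)) for _ in range(100)]
        + [(data_rng.uniform(9, 10), data_rng.uniform(9, 10)) for _ in range(100)])
nodes, edges = gng(data)
print(len(nodes), "nodes,", len(edges), "edges")
```

Because nodes only move toward data points or are inserted at midpoints, they stay inside the region the data covers; the edge set records which nodes ended up adjacent.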
![Page 33: Optimizing Online Yield via Predictive Modeling of Individual Site Visitors Magnify360 Liasons: Olivier Chaine, Jim Healy, Nate Pool, Gilles ????? David.](https://reader035.fdocuments.us/reader035/viewer/2022062717/56649e215503460f94b0d82e/html5/thumbnails/33.jpg)
A Few Parameters…
λ: Controls how frequently new nodes are inserted
Max Edge Age: Dictates how often old edges are deleted
εw: Factor to scale the value of the “winning” node
εn: Factor to scale the value of the next nearest node
α: Scale factor for decreasing the error of parent nodes
β: Scale factor for decreasing error of all nodes
(Making sense of the GUI)
… and the difference they make.
λ = 1000
• Larger λ, nodes inserted less often
• Takes longer, but yields more accurate placement of nodes
λ = 100
• Smaller λ, nodes inserted more often
• Leaves straggler nodes that don’t accurately match data
Support Vector Machines
Clearly planar
Planar in feature space
Support Vector Regression (Machine?)
Goal: Minimize error between hyper-plane and data points.
SVM: Maximize cluster separation
SVR: Minimize plane-to-data distance
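For reference, the contrast between the two goals can be written out using the standard textbook formulations. These are the usual hard-margin SVM and ε-insensitive SVR problems, not necessarily the exact variants the team used; the symbols $w$, $b$, $C$, $\varepsilon$, $\xi_i$ are the conventional ones and do not appear in the slides.

```latex
% SVM (classification, labels y_i \in \{-1, +1\}): maximize the margin 2 / \|w\|
\min_{w,\,b} \; \tfrac{1}{2}\|w\|^2
\quad \text{s.t.} \quad y_i \,(w \cdot x_i + b) \ge 1 \quad \forall i

% SVR (regression, targets y_i \in \mathbb{R}): keep the data within \varepsilon of the plane
\min_{w,\,b,\,\xi,\,\xi^*} \; \tfrac{1}{2}\|w\|^2 + C \sum_i (\xi_i + \xi_i^*)
\quad \text{s.t.} \quad
\begin{cases}
y_i - (w \cdot x_i + b) \le \varepsilon + \xi_i \\
(w \cdot x_i + b) - y_i \le \varepsilon + \xi_i^* \\
\xi_i,\ \xi_i^* \ge 0
\end{cases}
```

In the first problem the plane is pushed away from the data of both classes; in the second it is pulled toward the data, with deviations larger than ε penalized.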
Getting the correct page…
What do we want from a technique?
CLASSIFICATION:
Input: User data.
Output: Page to serve.
REGRESSION:
Input: User data and possible page.
Output: Predicted Success.
Both require multiple SVMs.
Using Classification via SVMs
DATA
Classifier outputs: C, B, C
Predicted Page: C
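One plausible reading of this slide is a vote among several binary classifiers, one per pair of pages. The sketch below assumes that one-versus-one scheme, which the slides do not confirm; the stand-in classifiers simply return the three example outputs, where real ones would be trained SVMs.

```python
from collections import Counter
from itertools import combinations


def predict_page(visitor, pages, pairwise):
    """Majority vote over one binary classifier per pair of pages."""
    votes = Counter(pairwise[pair](visitor) for pair in combinations(pages, 2))
    return votes.most_common(1)[0][0]


# Stand-in classifiers (hypothetical): each returns one of the slide's example votes.
pairwise = {
    ("A", "B"): lambda visitor: "B",
    ("A", "C"): lambda visitor: "C",
    ("B", "C"): lambda visitor: "C",
}
predicted = predict_page({"insomniac": 1}, ["A", "B", "C"], pairwise)
print(predicted)  # C
```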
Using Regression via SVRs
DATA
Page A Predictor: 0.42
Page B Predictor: 0.24
Page C Predictor: 0.78
Predicted Page: C
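With one regressor per page, choosing the page reduces to taking the highest predicted success. The sketch below assumes the three example outputs belong to pages A, B, and C in the order listed; the stand-in predictors return those numbers directly, where real ones would be trained SVRs.

```python
def choose_page(visitor, predictors):
    """Return the page with the highest predicted success, plus all scores."""
    scores = {page: predict(visitor) for page, predict in predictors.items()}
    return max(scores, key=scores.get), scores


# Stand-in predictors (hypothetical): each returns the slide's example output.
predictors = {
    "A": lambda visitor: 0.42,
    "B": lambda visitor: 0.24,
    "C": lambda visitor: 0.78,
}
page, predicted_success = choose_page({"insomniac": 1}, predictors)
print(page, predicted_success)  # C {'A': 0.42, 'B': 0.24, 'C': 0.78}
```

Unlike the classification route, this one also exposes how much better the chosen page is expected to be than the alternatives.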
Goal Breakdown
Short-term Plan
Plan for Algorithm Comparison
Schedule and Conclusion
Friday November 14: Prototype algorithm comparison method
Friday November 21: Initial testing on real data; meeting with Magnify360
Friday December 5: Initial composition of classification algorithms
Friday December 12: Midyear Report
Questions?
SVM vs SVR
SVM: Maximize Distance
SVR: Minimize Distance
Data
The Bad, or, The Challenges:
Lots of SQL data
Some Data Tables
80 tables total…
Data Size
Problem Statement
Officially: Develop an innovative predictive modeling system to predict shopping cart abandonment based on profiles, clusters, and shopping cart contents
Most importantly: GRAB from email ! Research and implement various AI techniques to optimize the process of matching users with websites
Individualized Online Experiences
Classifying Users
Unsupervised clustering: points are clustered without knowledge of the results
Supervised clustering: clusters are built using prior knowledge of the results
Ethical concerns?
Recap: What Magnify360 Does
Individualize a website for different types of users
Collect data on users from their clickstream, and give them the site that will appeal to them best
Appeal to a larger base of users by making the site more interesting to a larger group
serving both!
old Facebook