Online Display Advertising Optimization with H2O at ShareThis
Hassan Namarvar, Principal Data Scientist
SF DATA MINING MEETUP
December 9th, 2014
OUTLINE
Introducing ShareThis
Online display advertising problem
Estimation of conversion rate using H2O
Results from live campaigns
Ongoing work
Q&A
SHARING TOOLS AT SCALE
23 Billion PAGE VIEWS
120 SOCIAL CHANNELS
210 MM US USERS (1)
95% REACH*
2.4 MM SITES AND APPS
1. comScore Media Metrix Report
* Includes PC, Tablet, and Mobile sites.
USER
This is Missy! She is busy chatting and browsing on the web…
SOCIAL ACTIVITY
Missy reads an article and shares it to her Facebook page using the ShareThis widget.
SOCIAL DATA
ShareThis observes the share and can then target Missy and her friends with advertising messages tailored to their interests.
MAKING SOCIAL DATA ACTIONABLE
CATEGORY TARGETING: TECHNOLOGY
TVS: 1.1 MM
AUDIO: 800K
SMARTPHONES: 13.7 MM
TABLETS: 5.3 MM
PCs: 6 MM
GAMING: 7 MM
CAMERAS: 1.3 MM
28.6 MM USERS
35 MM+ SOCIAL ACTIONS
1.2 MM SOCIAL ACTIONS/DAY
STANDARD TARGETING
“DAN”: MALE 25-45, TECH ENTHUSIAST, HHI $75K+
[Figure: interest over time relative to a threshold, with stages TRIGGER, EXCITEMENT, PEAK READINESS FOR ENGAGEMENT, and FADING INTEREST]
REAL TIME MESSAGING REACHES USERS DURING PEAK INTEREST
SHARETHIS MESSAGING TRIGGER
ShareThis ONLY targets users within 24 hours to ensure ads reach them at the most relevant moment.
ONLINE DISPLAY ADVERTISING
Advertisers’ goal is to target the most receptive online audience in the right context and at the right time, so as to influence users to engage with the ad.
[Diagram components: Publisher Web Page, Ad, Ad Exchange, Real Time Bidding (RTB) System, Models, Model Pipeline (Production), ShareThis Data, Campaign Data, Meta Data]
ONLINE DISPLAY ADVERTISING
Campaign Performance
Advertisers seek the optimal price to bid for each ad call.
Cost per Click (CPC) Model
Cost per Action (CPA) Model
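As a rough illustration of how a predicted rate can turn into a bid price: under a CPA goal, the expected value of one impression is commonly taken to be the predicted conversion probability times the target CPA, and under a CPC goal the predicted click probability times the target CPC. The sketch below assumes exactly that. The function names and the numbers are hypothetical and are not taken from the ShareThis bidding system.

```python
def cpa_bid(p_conversion: float, target_cpa: float) -> float:
    """Expected value of one impression under a CPA goal (assumed rule)."""
    return p_conversion * target_cpa


def cpc_bid(p_click: float, target_cpc: float) -> float:
    """Expected value of one impression under a CPC goal (assumed rule)."""
    return p_click * target_cpc


def to_cpm(value_per_impression: float) -> float:
    """Exchanges usually quote prices per thousand impressions (CPM)."""
    return value_per_impression * 1000.0


# Hypothetical numbers: a 0.1% predicted CVR and a 50.00 target CPA.
bid = cpa_bid(p_conversion=0.001, target_cpa=50.0)
bid_cpm = to_cpm(bid)
```

Under this rule the quality of the bid depends entirely on how well the conversion probability is estimated, which is what the following slides address.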
MODELING CONVERSION RATE (CVR)
CTR and CVR are directly related to the user interacting with the ad in a given context.
Challenge
They are fundamentally difficult to model and predict directly.
CVR is even harder than CTR, since conversions are very rare events.
View-through conversions have longer delays in the logging system.
PROBLEM SETUP
Define sets of Users, Publishers, Ads, Devices, and Locations.
Goal
Find the optimal ad such that the probability of conversion is the
highest.
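One way to write this goal down is sketched below. The symbols are assumptions chosen for illustration, since the slide's own notation is not preserved in this transcript: u is a user, p a publisher, a an ad from the candidate set \mathcal{A}, d a device, and l a location.

```latex
a^{*} \;=\; \arg\max_{a \in \mathcal{A}} \; P\big(\text{conversion} = 1 \,\big|\, u, p, a, d, l\big)
```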
PROBLEM SETUP
At the single-user level, the problem is binary: conversion or no conversion.
The conversion event is a random binary event.
Transactional (low-level) data features are poorly correlated with a user’s direct response to a display ad.
DATA HIERARCHIES
Advertiser hierarchy (levels A0, A1, A2):
A0 Root
  Advertiser 1
    Campaign 1
    Campaign 2
  Advertiser 2
    Campaign 3
    Campaign K
Location hierarchy (levels L0, L1, L2):
L0 Root
  Location 1
    Zipcode 1
    Zipcode 2
  Location 2
    Zipcode 3
    Zipcode N
User hierarchy (levels U0, U1, U2):
U0 Root
  UserClust 1
    UserGroup 1
    UserGroup 2
  UserClust 2
    UserGroup 3
    UserGroup I
Publisher hierarchy (levels P0, P1, P2):
P0 Root
  PubType 1
    Publisher 1
    Publisher 2
  PubType 2
    Publisher 3
    Publisher J
HIGH LEVEL MODELING
Compute conversions for similar users, contexts, ads, …
Maximum Likelihood Estimate (MLE):
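For a binary conversion event, the maximum likelihood estimate of the conversion rate within a group is simply the number of conversions divided by the number of impressions in that group. The sketch below computes that ratio for any grouping of the data, for example one hierarchy level or a combination of levels. The record fields (`pub_type`, `campaign`) and the toy events are hypothetical.

```python
from collections import defaultdict


def mle_cvr(events, key_fn):
    """Return {group key: conversions / impressions}.

    events: iterable of (record, converted) pairs, one per impression.
    key_fn: maps a record to the group it belongs to at some hierarchy level.
    """
    impressions = defaultdict(int)
    conversions = defaultdict(int)
    for record, converted in events:
        key = key_fn(record)
        impressions[key] += 1
        conversions[key] += int(converted)
    return {key: conversions[key] / impressions[key] for key in impressions}


# Hypothetical impressions: (record, did the user convert?)
events = [
    ({"pub_type": "news", "campaign": "c1"}, True),
    ({"pub_type": "news", "campaign": "c1"}, False),
    ({"pub_type": "news", "campaign": "c2"}, False),
    ({"pub_type": "blog", "campaign": "c1"}, False),
]

# Estimate at one level (publisher type) and at a finer combination of levels.
by_pub_type = mle_cvr(events, lambda r: r["pub_type"])
by_pair = mle_cvr(events, lambda r: (r["pub_type"], r["campaign"]))
```

Coarser groupings give more stable estimates because they pool more impressions, while finer groupings are more specific but noisier when conversions are rare.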
COMBINING ESTIMATORS
LOGISTIC REGRESSION
Consider the MLEs of the CVRs of events at Q different levels.
Goal
Estimate CVR using combination of estimators:
Log-likelihood
Logistic Regression
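A minimal sketch of the idea, assuming the Q level-wise CVR estimates are used directly as input features of a logistic regression whose weights are fit by maximizing the log-likelihood with plain batch gradient ascent. The exact feature transformation and optimizer used in the talk are not given in this transcript, and the toy data below is invented for illustration (the values are far larger than real conversion rates so that the example converges quickly).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, row):
    """Combined CVR estimate: sigmoid(bias + sum_q w_q * estimate_q)."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit [bias, w_1, ..., w_Q] by gradient ascent on the mean log-likelihood.

    X: one row per event holding its Q level-wise CVR estimates.
    y: 1 if the event converted, else 0.
    """
    n, q = len(X), len(X[0])
    w = [0.0] * (q + 1)
    for _ in range(epochs):
        grad = [0.0] * (q + 1)
        for row, label in zip(X, y):
            err = label - predict(w, row)  # d(log-likelihood)/d(score)
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Toy data with Q = 2 estimators per event (hypothetical values).
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [0.2, 0.1],
     [0.1, 0.2], [0.3, 0.3], [0.6, 0.6], [0.4, 0.4]]
y = [1, 1, 1, 0, 0, 0, 0, 1]

weights = fit_logistic(X, y)
high = predict(weights, [0.9, 0.9])
low = predict(weights, [0.1, 0.1])
```

The learned weights indicate how much each hierarchy level contributes to the final estimate, which is one reason a linear model is convenient here.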
PRACTICAL ISSUES
Data Imbalance
CVR is inherently very low
Need to up-sample conversions or down-sample non-conversions
Remove Anomalies
Use retargeting visit data as a proxy for conversions when conversion data is not available
Remove outliers
Missing Features
Sometimes features are missing or there are not enough conversions
Impute features
Feature Selection
Discard a feature if it is missing in more than 70% of the training examples
Discard a feature if its variance is lower than a threshold (10e-9)
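Two of the steps above can be sketched in a few lines: down-sampling non-conversions, and discarding features by missing rate and variance. The thresholds come from the slide; the function names, the keep rate, and the representation of a missing value as `None` are assumptions made for this example.

```python
import random
from statistics import pvariance


def downsample_negatives(rows, labels, keep_rate, seed=0):
    """Keep every conversion and a random fraction of non-conversions."""
    rng = random.Random(seed)
    kept = [(r, y) for r, y in zip(rows, labels)
            if y == 1 or rng.random() < keep_rate]
    return [r for r, _ in kept], [y for _, y in kept]


def select_features(rows, max_missing=0.7, min_variance=10e-9):
    """Return names of features that pass both filters.

    rows: list of dicts mapping feature name -> number, or None if missing.
    A feature is dropped if more than max_missing of its values are missing,
    or if the variance of its observed values is below min_variance.
    """
    names = sorted({name for row in rows for name in row})
    kept = []
    for name in names:
        present = [row[name] for row in rows if row.get(name) is not None]
        if 1 - len(present) / len(rows) > max_missing:
            continue  # too many missing values
        if pvariance(present) < min_variance:
            continue  # (nearly) constant feature carries no signal
        kept.append(name)
    return kept


rows = [
    {"a": 1.0, "b": None, "c": 5.0},
    {"a": 2.0, "b": None, "c": 5.0},
    {"a": 3.0, "b": None, "c": 5.0},
    {"a": 4.0, "b": 1.0, "c": 5.0},
]
selected = select_features(rows)  # "b" is 75% missing, "c" is constant
sampled_rows, sampled_labels = downsample_negatives(rows, [1, 0, 0, 0], keep_rate=0.0)
```

Note that down-sampling non-conversions shifts the base rate seen by the model, so predicted probabilities generally need to be recalibrated before they are used as CVR estimates.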
WHY NEW MACHINE LEARNING TOOL?
Available large-scale ML tools such as Apache Mahout, Vowpal Wabbit, Hadoop RMR, and native Spark MLlib have their own issues.
Critical Features for a state-of-the-art ML package:
Ease of use
System reliability
In-memory (fast)
Distributed
Extensible (API/SDK)
Accurate algorithms
Visualization (data and results)
Easy to deploy to production
H2O PLATFORM
Screen shot for H2O platform web API
H2O PLATFORM: GLM MODEL
Screen shot for the CPA model using the GLM algorithm.
SCORE CALIBRATION
Calibrate Model Scores
Find the best threshold from the ROC curve (AUC)
Ad server attributes a conversion to the last impression
RTB needs to deliver a certain number of impressions per day
There is a trade-off between wasting impressions and winning
conversions.
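One common way to pick an operating threshold from the ROC curve is to maximize the true positive rate minus the false positive rate (Youden's J statistic). The sketch below assumes that criterion; the talk does not say which rule was actually used, and in practice the daily impression delivery requirement would also constrain the choice. The scores and labels are made up.

```python
def best_threshold(scores, labels):
    """Return (threshold, tpr, fpr) maximizing TPR - FPR.

    An impression is predicted positive when its score >= threshold.
    Requires at least one positive and one negative label.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tpr, fpr = tp / pos, fp / neg
        if best is None or tpr - fpr > best[1] - best[2]:
            best = (t, tpr, fpr)
    return best


# Hypothetical model scores with observed outcomes (1 = conversion).
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1]
threshold, tpr, fpr = best_threshold(scores, labels)
```

Lowering the threshold wins more conversions but spends more impressions on users unlikely to convert, which is the trade-off named above.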
BUILDING A CPA MODEL
RETARGETED VISITS AS A PROXY FOR CONVERSIONS
USER-CENTRIC: Focus on RT Users
OPTIMAL TIME: Deliver Ads at the optimal times
BETTER PERFORMANCE: Leverage optimization opportunities
DON’T WASTE IMP.: Target Users Who Are Likely to Convert
LIVE TEST ON A CAR INSURANCE CAMPAIGN
TESTED FOR TWO MONTHS AND MEASURED THE PERFORMANCE BY DFA.
The CPA test for a car insurance campaign showed a 58% improvement in eCPA and a 57% improvement in conversion rate (CVR).
LIVE TESTS ON DIFFERENT CAMPAIGNS
OBSERVED CPA LIFT
ONGOING WORK
Tests are expensive and time consuming
We need to evaluate models before deploying to production
Build many models and evaluate them offline
Different datasets
Different features
Different algorithms
COMBINING ESTIMATORS
GRADIENT BOOSTING MACHINE
Consider a set of categorical features.
Goal
Estimate CVR using an ensemble of weak prediction models,
decision trees:
Gradient boosting combines weak learners into a single strong
learner, in an iterative fashion.
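To make the iterative idea concrete, here is a deliberately small gradient boosting sketch for binary outcomes. It assumes the categorical features have been one-hot encoded to 0/1, uses one-split trees (stumps) as weak learners, and fits each stump by least squares to the negative gradient of the log-loss. It illustrates the technique only and is not H2O's GBM implementation; the feature meanings and data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Least-squares stump on 0/1 features: (feature, value if 0, value if 1).

    Assumes at least one feature takes both values in X.
    """
    best = None
    for j in range(len(X[0])):
        left = [r for row, r in zip(X, residuals) if row[j] == 0]
        right = [r for row, r in zip(X, residuals) if row[j] == 1]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, j, lm, rm)
    return best[1:]


def fit_gbm(X, y, n_trees=50, learning_rate=0.3):
    """Requires both classes to be present in y."""
    base = math.log(sum(y) / (len(y) - sum(y)))  # log-odds of the base rate
    scores = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of the log-loss w.r.t. the score: label - probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, lv, rv = fit_stump(X, residuals)
        stumps.append((j, lv, rv))
        scores = [s + learning_rate * (rv if row[j] == 1 else lv)
                  for s, row in zip(scores, X)]
    return base, stumps, learning_rate


def predict_gbm(model, row):
    base, stumps, lr = model
    return sigmoid(base + sum(lr * (rv if row[j] == 1 else lv)
                              for j, lv, rv in stumps))


# Hypothetical one-hot features: [is_news_site, is_mobile]
X = [[1, 0], [1, 0], [1, 1], [1, 1], [0, 0], [0, 0], [0, 1], [0, 1]]
y = [1, 1, 1, 0, 0, 0, 0, 0]
model = fit_gbm(X, y)
p_news = predict_gbm(model, [1, 0])
p_other = predict_gbm(model, [0, 0])
```

Each round corrects the errors left by the previous rounds, so many weak stumps add up to a model that can capture interactions a single stump cannot.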
MODEL COMPARISON
Comparing AUC plots of GBM and RF models on test data:
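For reference, AUC can be computed directly as the probability that a randomly chosen conversion receives a higher score than a randomly chosen non-conversion, with ties counted as one half. The sketch below uses that pairwise definition, which is quadratic in the data size and meant only for small illustrations. The two score lists stand in for two models and are invented.

```python
def auc(scores, labels):
    """Pairwise AUC: P(score of a positive > score of a negative), ties = 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
model_a_scores = [0.10, 0.40, 0.35, 0.80]  # one positive ranked below a negative
model_b_scores = [0.10, 0.20, 0.30, 0.90]  # perfect ranking

auc_a = auc(model_a_scores, labels)
auc_b = auc(model_b_scores, labels)
```

Because AUC depends only on the ranking of scores, it can compare models whose raw scores are on different scales.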
OFFLINE SIMULATIONS
Comparing AUC plots of GBM and RF models on test data:
OFFLINE SIMULATIONS
Selecting models in practice
Accuracy of prediction on unseen data
Scoring time at production
Remove anomalies using Deep Learning
Correlations with other campaign KPIs (CTR, Brand lift,
Viewability, Winning Price, …)
Performance Stability
EVALUATION ON IMPRESSION DATA
Correlation of GBM model scores with CTR
EVALUATION ON IMPRESSION DATA
Correlation of GBM model scores with average winning bid price
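A simple way to check this kind of relationship offline is to sort impressions by model score, split them into equal-size buckets, compute the KPI per bucket (CTR here), and then measure the Pearson correlation between the bucket's mean score and its KPI. The bucketing scheme and the toy data below are assumptions for illustration; the talk does not describe how its plots were produced.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def ctr_by_score_bucket(scores, clicks, n_buckets):
    """Return [(mean score, CTR)] for equal-size buckets of ascending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    size = len(order) // n_buckets
    result = []
    for b in range(n_buckets):
        # The last bucket absorbs any remainder.
        idx = order[b * size:(b + 1) * size] if b < n_buckets - 1 else order[b * size:]
        mean_score = sum(scores[i] for i in idx) / len(idx)
        ctr = sum(clicks[i] for i in idx) / len(idx)
        result.append((mean_score, ctr))
    return result


# Hypothetical impressions: model score and whether the ad was clicked.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
clicks = [0, 0, 0, 1, 0, 1, 1, 1]
buckets = ctr_by_score_bucket(scores, clicks, n_buckets=4)
r = pearson([m for m, _ in buckets], [c for _, c in buckets])
```

The same two functions apply unchanged to other per-impression KPIs, such as the winning bid price, by replacing the click indicator with that value.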
GBM MODEL TESTS vs GLM MODEL CONTROL
A/B TEST: OBSERVED CPA LIFT
CONCLUSION
How did H2O help us?
Maximized ROI by optimizing campaign performance and budget allocation
Enabled advanced ML algorithms on the Hadoop cluster
Used all data and built models much faster
Reduced R&D time significantly
Built a smooth model-building pipeline (R and Spark API)
ACKNOWLEDGEMENT
THE TEAM:
Prasanta Behera
Xibin Chen
Wahid Chrabakh
Jinghao Miao
Hassan Namarvar
Yan Qu
THANK YOU!
SHARETHIS IS HIRING!
Please check out:
www.sharethis.com/about/careers
Q&A