![Page 1: JOE HSY OPE BANWO...Lookalike Modeling JOE HSY @LIVERAMP Head of Engineering - Data Science and Innovation LiveRamp OPE BANWO @LIVERAMP Software Engineer, Data Science LiveRamp LiveRamp](https://reader033.fdocuments.us/reader033/viewer/2022042612/5f5784792f75f60005253f24/html5/thumbnails/1.jpg)
#RampUp20
RAMPUP FOR DEVELOPERS
The Systems Behind Lookalike Modeling
JOE HSY @LIVERAMP
Head of Engineering - Data Science and Innovation
LiveRamp
OPE BANWO @LIVERAMP
Software Engineer, Data Science
LiveRamp
LiveRamp Lookalike Modeling API (LLAMA)
Joe Hsy, Opeoluwa Banwo
What is Lookalike Modeling?
1. Start with a known seed audience segment
2. Expand the segment with lookalike audiences
3. Activate to multiple destinations: Display, Social, Programmatic, Mobile, Video/TV
Goals of LLAMA
Adjustable
Self-service controls facilitate optimization of modeled audiences for Reach or Similarity.
Accurate
Seed and modeled audiences are matched deterministically to IdentityLinks, to provide more precise targeting.
Flexible
With IdentityLink, both online and offline data can be ingested and modeled out to different channel types for activation.
Customizable
Use LiveRamp’s licensed reference data set from the LiveRamp Data Store, or bring your own data (BYOD).
LiveRamp’s Approach to Building Data Science Products
Early Days of Data Science
Data Science Application Developers
LiveRamp’s Approach
Data Science + Software Engineering
REST API
Application Developers
Four Pillars of Data Science Productization
APIs, APIs, APIs
Create a plug-and-play ecosystem with reliable and secure APIs.
Automate the Data Engineering Process
Automate the data engineering layer with scalable pipelines for data onboarding and model building.
Plug-n-play Data Science Models
Design the architecture to accommodate a wide range of machine learning model classes, so we can continually evaluate new approaches.
Built-in Reliability Engineering
Enforce a software development life cycle of continuous improvement through reliable code and a robust CI/CD process.
LLAMA Data Science
Challenges of Merging Reference Datasets
Dataset A, Dataset B
Overlapping columns: must standardize data types and values
Overlapping records: can use these for missing-data imputation models
Must impute missing values here (marked on two regions of the diagram)
Must resolve value conflicts here
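The merge challenges above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the column names (`age`, `gender`, `income`), the "prefer dataset A" conflict rule, and the median fill are all hypothetical. The slide suggests learning imputation models from the overlapping records; the median here is a simple stand-in for such a model.

```python
from statistics import median

def standardize(record):
    """Coerce overlapping columns to shared types and value vocabularies."""
    out = dict(record)
    if out.get("age") is not None:
        out["age"] = int(out["age"])                            # "42" -> 42
    if out.get("gender") is not None:
        out["gender"] = str(out["gender"]).strip().upper()[:1]  # "female" -> "F"
    return out

def merge_datasets(a, b, prefer="a"):
    """Merge two {record_id: {column: value}} datasets into one table."""
    columns = sorted({c for ds in (a, b) for rec in ds.values() for c in rec})
    merged = {}
    for rid in sorted(set(a) | set(b)):
        ra, rb = standardize(a.get(rid, {})), standardize(b.get(rid, {}))
        first, second = (ra, rb) if prefer == "a" else (rb, ra)
        row = {}
        for col in columns:
            v1, v2 = first.get(col), second.get(col)
            # On a value conflict the preferred source wins (hypothetical rule).
            row[col] = v1 if v1 is not None else v2
        merged[rid] = row
    return merged

def impute_median(merged, column):
    """Fill missing values in one numeric column with the column median."""
    known = [row[column] for row in merged.values() if row[column] is not None]
    fill = median(known)
    for row in merged.values():
        if row[column] is None:
            row[column] = fill
    return merged
```

Records that appear in only one dataset end up with missing values for the other dataset's columns, which is where the imputation step applies.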
Model Building Approach
We currently use ridge regression with cross-validation for hyperparameter tuning
• This enables variable selection as we often handle 1000+ variables.
• We also sample the data as ridge regression has a fairly limited capacity - we found 30k positive examples to be the best number.
We continuously evaluate other approaches:
• Random forests (too slow)
• Boosted trees (overfit easily)
• Neural networks (didn’t offer meaningful improvements over ridge)
We aim for the simplest model class that performs well
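As a rough illustration of ridge regression with cross-validated hyperparameter tuning, the sketch below fits a closed-form ridge model on 0/1 labels and picks the penalty `alpha` by k-fold cross-validation. It is not LLAMA's implementation: the candidate `alphas`, the fold count, and the pure-Python linear solver are assumptions chosen to keep the example self-contained. Note that an L2 penalty shrinks coefficients toward zero rather than setting them exactly to zero, so any variable selection would presumably rank features by coefficient magnitude; that reading is also an assumption.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_ridge(X, y, alpha):
    """Closed-form ridge: w = (X^T X + alpha I)^-1 X^T y."""
    d = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (alpha if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(row[i] * target for row, target in zip(X, y)) for i in range(d)]
    return solve(A, b)

def score(w, x):
    """Linear score for one example; higher means more seed-like."""
    return sum(wi * xi for wi, xi in zip(w, x))

def cv_mse(X, y, alpha, k=5):
    """Mean squared error of ridge with this alpha under k-fold CV."""
    n = len(X)
    squared_error = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train = [i for i in range(n) if i not in held_out]
        w = fit_ridge([X[i] for i in train], [y[i] for i in train], alpha)
        squared_error += sum((score(w, X[i]) - y[i]) ** 2 for i in held_out)
    return squared_error / n

def fit_ridge_cv(X, y, alphas=(0.01, 0.1, 1.0, 10.0, 100.0)):
    """Pick the alpha with the lowest CV error, then refit on all data."""
    best_alpha = min(alphas, key=lambda a: cv_mse(X, y, a))
    return best_alpha, fit_ridge(X, y, best_alpha)
```

Each row of `X` is expected to carry a constant `1.0` feature if an intercept is wanted, since the sketch does not add one.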
LLAMA Architecture
LLAMA System Workflow
[Workflow diagram; recoverable labels grouped below]
Inputs: Raw Dataset 1, Raw Dataset 2, Seeded Dataset, (optional other data sources)
Pipelines: Profile Pipeline, Onboarding Pipeline (Prepare Connected Components, Resolve, Deduplication By-Pass), Sample Transformation Pipeline (Sample for Train/Test, Feature Engineering), Model Training, Predict Pipeline
Artefacts: Advice Sheet, Onboarded Data, Transformed Training / Test Set, Transformation Artefacts, Model Artefacts, Scores, Percentiles
Profiling
LLAMA uses a robust profiler that provides full insight into the features, content, and quality of a dataset
• Automatically determine unique, continuous, categorical, dummy, etc.
• Automatically detect data types, including generic data types as well as IDLs, etc.
• Automatically detect one-hot-encoded features and sparse features
• Report distributions and levels depending on the type of feature
• Generate automated, editable transform policies for onboarding
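A minimal sketch of what such a profiler might look like, assuming rows are plain dictionaries. The kind names, the thresholds (`max_levels`, `sparse_threshold`), and the policy actions (`drop`, `identifier`, `scale`, `encode`, `keep`) are hypothetical and are not taken from LLAMA.

```python
def _is_number(value):
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False

def profile_column(values, max_levels=20, sparse_threshold=0.95):
    """Classify one column and report basic quality statistics."""
    present = [v for v in values if v is not None and v != ""]
    missing_rate = 1.0 - len(present) / len(values) if values else 1.0
    levels = set(present)
    if not present:
        kind = "empty"
    elif levels <= {0, 1, "0", "1"}:
        kind = "dummy"                      # binary indicator column
    elif all(_is_number(v) for v in present) and len(levels) > max_levels:
        kind = "continuous"
    elif len(levels) == len(present):
        kind = "unique"                     # every value distinct, e.g. an ID
    elif len(levels) <= max_levels:
        kind = "categorical"
    else:
        kind = "high_cardinality"
    return {
        "kind": kind,
        "missing_rate": missing_rate,
        "n_levels": len(levels),
        "sparse": missing_rate >= sparse_threshold,
    }

def profile_dataset(rows, **options):
    """Profile every column that appears in a list of row dictionaries."""
    columns = sorted({c for row in rows for c in row})
    return {c: profile_column([row.get(c) for row in rows], **options)
            for c in columns}

def propose_policy(profile):
    """Turn a profile into an editable {column: action} transform policy."""
    actions = {"unique": "identifier", "continuous": "scale",
               "categorical": "encode", "dummy": "keep"}
    return {col: "drop" if stats["sparse"] else actions.get(stats["kind"], "drop")
            for col, stats in profile.items()}
```

Because the policy is an ordinary dictionary, a user could review and edit it before the onboarding stage consumes it.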
Onboarding
Process a dataset based on the editable policy produced at the profiling stage
Onboarding can be based on a single profiled dataset or on multiple profiled datasets
This crucial capability enables LLAMA to compile multiple spines into the most feature-rich reference datasets in the market
• Reverse one-hot-encode features that were one-hot-encoded
• Drop sparse or irrelevant features
• Resolve many-to-many relationships between row IDs and IDLs
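Two of the onboarding steps above, reversing a one-hot encoding and dropping sparse features, can be sketched as follows. The `prefix_level` column naming convention, the handling of rows with zero or several active indicators, and the 0.95 threshold are assumptions made for illustration.

```python
def reverse_one_hot(rows, prefix, sep="_"):
    """Collapse indicator columns such as color_red / color_blue into one
    categorical column named after the prefix."""
    out = []
    for row in rows:
        hot_cols = [c for c in row if c.startswith(prefix + sep)]
        active = [c[len(prefix) + len(sep):] for c in hot_cols
                  if row[c] in (1, "1")]
        collapsed = {c: v for c, v in row.items() if c not in hot_cols}
        # Exactly one active indicator is a clean one-hot row;
        # anything else is treated as missing (assumed rule).
        collapsed[prefix] = active[0] if len(active) == 1 else None
        out.append(collapsed)
    return out

def drop_sparse(rows, max_missing_rate=0.95):
    """Remove columns whose share of missing values exceeds the threshold."""
    columns = {c for row in rows for c in row}
    keep = set()
    for col in columns:
        missing = sum(1 for row in rows if row.get(col) is None)
        if missing / len(rows) <= max_missing_rate:
            keep.add(col)
    return [{c: v for c, v in row.items() if c in keep} for row in rows]
```

Collapsing indicators back into a single categorical column lets downstream feature engineering choose its own encoding instead of inheriting the source's.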
Sample & Transform
Computes Training & Test sets based on client’s seed dataset
Transforms Training & Test sets based on custom-defined transform logic
Persists transformed Training & Test sets and transform logic
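The three steps above might look roughly like this in plain Python. The sketch assumes the reference data is a `{record_id: {column: number}}` dictionary, labels seed members 1 and an equal number of sampled non-members 0 (the balanced sampling is an assumption), caps positives at a configurable limit, and uses z-score scaling as a stand-in for the custom transform logic. In the actual stack this role is presumably played by TensorFlow Transform on Dataflow.

```python
import json
import random
from statistics import mean, pstdev

def build_train_test(reference, seed_ids, max_positives=30_000,
                     test_fraction=0.2, rng_seed=0):
    """Label seed members 1 and sampled non-members 0, then split."""
    rng = random.Random(rng_seed)
    positives = [rid for rid in reference if rid in seed_ids]
    negatives = [rid for rid in reference if rid not in seed_ids]
    positives = rng.sample(positives, min(len(positives), max_positives))
    negatives = rng.sample(negatives, min(len(negatives), len(positives)))
    labeled = [(rid, 1) for rid in positives] + [(rid, 0) for rid in negatives]
    rng.shuffle(labeled)
    n_test = int(len(labeled) * test_fraction)
    return labeled[n_test:], labeled[:n_test]        # (train, test)

def fit_transform_logic(reference, train, columns):
    """Learn per-column mean and std on the training set only."""
    logic = {}
    for col in columns:
        values = [reference[rid][col] for rid, _ in train]
        logic[col] = {"mean": mean(values), "std": pstdev(values) or 1.0}
    return logic

def apply_transform(reference, examples, logic):
    """Apply the learned scaling to any set of (record_id, label) pairs."""
    return [([(reference[rid][col] - p["mean"]) / p["std"]
              for col, p in logic.items()], label)
            for rid, label in examples]

def persist_logic(logic):
    """Serialize the transform logic so prediction can reuse it unchanged."""
    return json.dumps(logic, sort_keys=True)
```

Fitting the transform on the training set alone and persisting it keeps the test set and the later predict stage on exactly the same scaling.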
Predict
Computes the predicted probability for instances of the onboarded dataset
Computes percentile limits based on a sample of the predicted scores
Co-locates instances belonging to the same percentile
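The percentile steps can be illustrated with a short sketch: estimate the 99 cut points from a sample of scores, then group record IDs by percentile bucket. The function names, the sample size, and the convention that bucket 100 holds the highest scores are assumptions.

```python
import bisect
import random
from collections import defaultdict

def percentile_limits(scores, sample_size=10_000, rng_seed=0):
    """Estimate the 99 cut points between percentiles from a score sample."""
    rng = random.Random(rng_seed)
    sample = sorted(rng.sample(scores, min(len(scores), sample_size)))
    return [sample[min(len(sample) - 1, (p * len(sample)) // 100)]
            for p in range(1, 100)]

def percentile_of(score, limits):
    """Return a bucket in 1..100, where 100 holds the highest scores."""
    return bisect.bisect_right(limits, score) + 1

def colocate(scored, limits):
    """Group record IDs by percentile so each bucket can be read as a unit."""
    buckets = defaultdict(list)
    for record_id, score in scored.items():
        buckets[percentile_of(score, limits)].append(record_id)
    return buckets
```

Sampling keeps the sort cheap on a large dataset, and storing records by bucket means a later audience request only has to read the buckets it needs.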
Technology Stack
Cloud Dataflow
• Highly scalable MapReduce solutions with near-zero infrastructure setup
TensorFlow [Transform]
• Ecosystem of tools to build and train ML models efficiently
• Seamless integration with Dataflow for feature engineering
Cloud Composer
• Management and monitoring of the end-to-end workflow
BigQuery
• Efficient and cheap solution for CCPA/GDPR compliance operations
Example Use Case
Create a new campaign.
Request Lookalike
Choose Reach vs Accuracy Using Slider
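One plausible way a slider could trade reach against accuracy is to map its position to the number of top percentile buckets included in the audience. The sketch below is purely hypothetical: the slider range, the linear mapping, and the `min_buckets` / `max_buckets` bounds are assumptions and are not described in the talk.

```python
def percentiles_for_slider(slider, min_buckets=1, max_buckets=50):
    """Map a slider in [0, 1] to the percentile buckets to activate.
    0.0 keeps only the most seed-like bucket; 1.0 maximizes reach."""
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be between 0 and 1")
    n = min_buckets + round(slider * (max_buckets - min_buckets))
    return list(range(100, 100 - n, -1))

def select_audience(buckets, slider):
    """Collect record IDs from the chosen buckets, most similar first.
    `buckets` maps percentile (1..100) to a list of record IDs."""
    return [record_id
            for percentile in percentiles_for_slider(slider)
            for record_id in buckets.get(percentile, [])]
```

With scores already grouped by percentile, moving the slider only changes which buckets are read, so no rescoring is needed.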
Activate Lookalike Audiences
Future Directions
External API
• Enable partners and customers to programmatically iterate quickly and create many audiences at different cuts of reach/similarity
• Looking for beta customers!
Enhance Audience Expansion Flexibility
• Current lookalike models support binary classes only: audience members are ranked on their similarity to the seed
• Multiclass classifiers allow for audience expansion across different attributes, such as low, medium, and high loyalty
2020 LiveRamp. All rights reserved.
Questions?