How Lazada ranks products to improve customer experience and conversion
eugene-yan-ziyou
How Lazada ranks products to improve customer experience and conversion
Strata Hadoop Singapore 2016
Leading e-commerce platform in South-East Asia
Lazada Data Science
Data App Devs: expose, integrate, platform-ize
Data Scientists: explore, prepare, model
Data Engineers: collect, store, maintain
Start from bottom up
Ranking affects what appears on top
Ranking is different from recommendation
“How can I rank well on an e-commerce platform?”
Ranking products for catalog and search
Introducing new products
Emphasizing product quality
[Architecture diagram]
Data sources: Web Tracker (JavaScript), Mobile Tracker (Adjust), 3rd Party (e.g., ZenDesk, SurveyGizmo)
Ingestion: Kafka Queues, Bulk Loaders (Spark)
Storage: Hadoop
Data: Product, Seller, Transaction
Data Exploration + Data Preparation + Feature Engineering + Modelling (Spark)
Manual Boosting (Django) (Category Managers)
Local Validation
A/B Testing
Product rankings (User devices)
Split traffic and measure outcomes
Overall results
Better ranking improved conversion and revenue per session
Introducing new products improved new product engagement
Emphasizing product quality had neutral to positive outcomes
Ranking products for catalog and search
Intent
Provide shoppers quick access to the best products in catalog and search results, making shopping easy
Problem
Lazada has millions of products, which makes the catalog hard to navigate
How to identify products that interest users in the future?
How do we measure interest?
Methodology
Measure shoppers’ interest through product engagement as a proxy
Clicks, add-to-cart, checkouts, etc.
Predict future interest
Collecting behavioral data
Track and collect events on web (JavaScript) and app (Adjust)
Stream and process via Kafka
Store in Hive tables
Data preparation
Filter and categorize online behavioral events (e.g., impressions, clicks, etc.)
Merge various views of product data (e.g., price, stock, etc.)
Exclude outliers and potentially fraudulent events
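The data preparation steps above can be sketched in plain Python. This is a minimal illustration, not Lazada's pipeline: the event fields (`user`, `product`, `type`), the list of valid event types, and the per-user event cap used as a fraud heuristic are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Assumed event types; the slides only mention "impressions, clicks, etc."
VALID_TYPES = {"impression", "click", "add_to_cart", "checkout"}


def prepare_events(events, max_events_per_user=1000):
    """Filter, screen, and categorize raw events into per-product counts."""
    # 1. Filter: keep only event types we know how to interpret.
    known = [e for e in events if e["type"] in VALID_TYPES]

    # 2. Exclude potentially fraudulent activity. Hypothetical heuristic:
    #    a user with more events than the cap is treated as suspicious.
    per_user = Counter(e["user"] for e in known)
    suspicious = {u for u, n in per_user.items() if n > max_events_per_user}

    # 3. Categorize: count events per product and per event type.
    counts = defaultdict(Counter)
    for e in known:
        if e["user"] not in suspicious:
            counts[e["product"]][e["type"]] += 1
    return dict(counts)


# Hypothetical sample data.
events = [
    {"user": "u1", "product": "p1", "type": "impression"},
    {"user": "u1", "product": "p1", "type": "click"},
    {"user": "u2", "product": "p1", "type": "impression"},
    {"user": "u2", "product": "p1", "type": "scroll"},  # unknown type, dropped
] + [{"user": "bot", "product": "p2", "type": "click"} for _ in range(5)]

counts = prepare_events(events, max_events_per_user=3)
```

In practice the cap would be tuned on real traffic, and a production job would run on Spark over the Hive tables rather than on in-memory lists.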
Feature engineering
Calculate product engagement metrics (e.g., average clicks, conversion rate, etc.)
Derive product attributes (e.g., age, discount, etc.)
Exclude outliers (e.g., conversion rate > 1.00)
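A rough sketch of this feature engineering step follows. The metric definitions are assumptions, since the slides do not define them: click-through rate is taken as clicks per impression, conversion rate as checkouts per click, and discount as one minus price over list price. The field names are hypothetical.

```python
from datetime import date


def engineer_features(row, today):
    """Derive engagement metrics and product attributes for one product."""
    impressions, clicks, checkouts = row["impressions"], row["clicks"], row["checkouts"]
    return {
        "product": row["product"],
        # Engagement metrics (assumed definitions).
        "ctr": clicks / impressions if impressions else 0.0,
        "conversion_rate": checkouts / clicks if clicks else 0.0,
        # Product attributes.
        "age_days": (today - row["listed_on"]).days,
        "discount": 1 - row["price"] / row["list_price"],
    }


def build_features(rows, today):
    """Compute features, then drop outliers such as conversion rate > 1.00."""
    features = [engineer_features(r, today) for r in rows]
    return [f for f in features if f["conversion_rate"] <= 1.0]


# Hypothetical sample data.
rows = [
    {"product": "p1", "impressions": 100, "clicks": 10, "checkouts": 2,
     "price": 80.0, "list_price": 100.0, "listed_on": date(2016, 11, 1)},
    # More checkouts than clicks: likely a tracking error, so it is excluded.
    {"product": "p2", "impressions": 50, "clicks": 1, "checkouts": 3,
     "price": 20.0, "list_price": 20.0, "listed_on": date(2016, 11, 20)},
]

features = build_features(rows, today=date(2016, 12, 1))
```

A conversion rate above 1.00 means more checkouts than clicks were recorded, which usually points to a tracking gap rather than real behavior.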
Modelling (i.e., machine learning)
Predict future (tomorrow’s) product clicks/checkouts
Examine results against a benchmark model
Pandas + XGBoost is faster and more effective than Spark + MLlib; XGBoost4J-Spark is being assessed
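To illustrate what examining results against a benchmark can look like, here is a small sketch. The benchmark is assumed to be a naive "tomorrow equals today" forecast, and a three-day moving average stands in for the trained model; neither is confirmed by the slides, which mention XGBoost as the actual model. The click histories are made up.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_benchmark(history):
    """Benchmark: predict that tomorrow's clicks equal today's clicks."""
    return history[-1]


def moving_average_model(history, window=3):
    """Stand-in for a trained model: mean of the last `window` days."""
    recent = history[-window:]
    return sum(recent) / len(recent)


# Hypothetical daily click histories and the clicks observed the next day.
histories = {"a": [10, 12, 11, 13], "b": [5, 9, 4, 8]}
next_day = {"a": 12, "b": 6}

products = sorted(histories)
actual = [next_day[p] for p in products]
benchmark_mae = mean_absolute_error(actual, [naive_benchmark(histories[p]) for p in products])
candidate_mae = mean_absolute_error(actual, [moving_average_model(histories[p]) for p in products])

# A candidate model is only worth shipping to A/B testing if it beats the benchmark.
beats_benchmark = candidate_mae < benchmark_mae
```

The same comparison applies unchanged when the candidate is a gradient-boosted model: only the prediction function changes.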
Boosting products (manually)
Manually increase rank of certain products (e.g., highly anticipated products, campaign tie-ups)
User-friendly interface to drag-and-drop products
Limits on how many products can be boosted
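The boosting rule with a cap could be sketched as below. This assumes that boosting means pinning the chosen products to the top of the model's ranking in the order a category manager picked them; the function name and the default limit of three are hypothetical.

```python
def apply_boosts(ranked_ids, boosted_ids, max_boosts=3):
    """Pin manually boosted products above the model-ranked list.

    Raises ValueError when more products are boosted than the limit allows.
    """
    if len(boosted_ids) > max_boosts:
        raise ValueError(f"at most {max_boosts} products can be boosted")

    known = set(ranked_ids)
    # Keep the manager's order, drop duplicates and products not in this list.
    pinned = [p for p in dict.fromkeys(boosted_ids) if p in known]
    pinned_set = set(pinned)
    # Everything else keeps its model-assigned order.
    rest = [p for p in ranked_ids if p not in pinned_set]
    return pinned + rest


final_ranking = apply_boosts(["a", "b", "c", "d"], ["c"])
```

The limit keeps manual boosts from overriding the model's ranking wholesale.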
Validation and A/B testing
Local validation is easy, but it is difficult to ensure similar results in A/B testing
A/B test all updates before production
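Splitting traffic for an A/B test is often done with deterministic hashing so that a user always sees the same variant. The sketch below shows that general technique; it is not a description of the A/B testing platform used at Lazada, and the experiment name and function are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, treatment_share=0.5):
    """Deterministically assign a user to 'treatment' or 'control'."""
    # Hash experiment name and user id so assignments differ across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"


variant = assign_variant("user-42", "ranking-v2")
```

Because the assignment depends only on the user id and the experiment name, no lookup table is needed and repeat visits stay in the same group.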
Results
Increased conversion rate by 3–8%
Increased revenue per session by 5–20%
Introducing new products
Intent
Provide potentially good new products with exposure
Provide shoppers with new products they like
Keep catalog fresh
Problem
Products with strong engagement stay on top
Products without engagement don’t get traffic
How can we identify new products that are likely to interest users?
Methodology (demand)
Find what people need
Measure needs through internal/external data
Rank new products in terms of demand
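One way to picture the demand side is to count search queries from logs and score each new product by how many searches it would satisfy. The sketch below makes several assumptions not stated in the slides: search logs are the data source, each log line is tab-separated as timestamp, user, query, and a product satisfies a query when every query word appears in its title.

```python
from collections import Counter


def parse_queries(log_lines):
    """Count normalized search queries from tab-separated log lines."""
    queries = Counter()
    for line in log_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip malformed lines
        query = parts[2].strip().lower()
        if query:
            queries[query] += 1
    return queries


def demand_score(title, queries):
    """Sum the counts of queries whose words all appear in the product title."""
    title_tokens = set(title.lower().split())
    return sum(n for q, n in queries.items() if set(q.split()) <= title_tokens)


# Hypothetical log lines.
log_lines = [
    "2016-12-01T10:00\tu1\tphone case",
    "2016-12-01T10:01\tu2\tPhone Case",
    "2016-12-01T10:02\tu3\tphone case",
    "2016-12-01T10:03\tu1\tyoga mat",
    "malformed line without tabs",
]
queries = parse_queries(log_lines)

new_titles = ["Clear phone case iPhone", "Yoga mat purple", "Garden hose"]
ranked_by_demand = sorted(new_titles, key=lambda t: demand_score(t, queries), reverse=True)
```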
Methodology (supply)
Find products similar to top products
Measure similarity with top products
Rank new products based on similarity and top product volume
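The supply-side idea can be sketched with a simple similarity measure. Here Jaccard similarity over title words stands in for whatever similarity model is actually used (the slides mention Spark GraphX and ElasticSearch), and the score is assumed to be similarity multiplied by the sales volume of the closest top product. Titles and volumes are made up.

```python
def tokens(title):
    """Lower-cased set of words in a product title."""
    return set(title.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_new_product(title, top_products):
    """Best (similarity x volume) over all top products."""
    t = tokens(title)
    return max(
        (jaccard(t, tokens(p["title"])) * p["volume"] for p in top_products),
        default=0.0,
    )


def rank_new_products(new_products, top_products):
    """Order new products by their similarity-weighted score, highest first."""
    return sorted(
        new_products,
        key=lambda p: score_new_product(p["title"], top_products),
        reverse=True,
    )


# Hypothetical sample data.
top_products = [
    {"title": "Wireless Bluetooth Headphones", "volume": 100},
    {"title": "Cotton Bed Sheet", "volume": 40},
]
new_products = [
    {"id": "n3", "title": "Garden hose"},
    {"id": "n2", "title": "Cotton bed sheet set"},
    {"id": "n1", "title": "Bluetooth headphones black"},
]
ranked = [p["id"] for p in rank_new_products(new_products, top_products)]
```

Weighting by volume favors new products that resemble high-selling items over those that closely match a low-volume one.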
Data preparation and feature engineering
Parse (log) data to identify shoppers’ needs
Measure potential product demand
Model product similarity (Spark GraphX / ElasticSearch)
Validation and A/B testing
Limited capability on existing A/B testing platforms to track specific products
Measure performance of new products across experimental groups using in-house tracker
Results
Increased new product click-through rate by 30–80%
Increased new product add-to-cart by 20–90%
Expected overall conversion to decrease, but it increased instead (though the increase was not statistically significant)
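Whether a conversion difference between two groups is statistically significant can be checked with a two-proportion z-test. The sketch below shows that standard test; the slides do not say which test was used, and the session and conversion counts are invented for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between A and B.

    Returns (z, p_value). Uses the pooled standard error under the null
    hypothesis that both groups share the same conversion rate.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: control converts 200 of 10,000 sessions, treatment 210.
z, p_value = two_proportion_z_test(200, 10_000, 210, 10_000)
significant = p_value < 0.05
```

With these made-up numbers the treatment converts slightly better, but the difference is well within what chance alone could produce.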
Emphasizing product quality
Intent
Improve customer experience throughout purchase journey
From online browsing to receiving of product
Product quality identified as key driver
Problem
How do we measure product “quality”?
Methodology (online)
Content (e.g., title quality, richness of content)
Reviews (e.g., average rating, negative reviews)
Performance (e.g., click-through rate, browsing time)
Methodology (offline)
Perfect order rate (i.e., not cancelled, not returned, etc.)
Negative feedback (e.g., counterfeit, complaints, etc.)
Seller metrics (e.g., timely shipped-rate, return rate, etc.)
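Two of these offline metrics are simple ratios and can be sketched directly. The example covers only the conditions the slide names (not cancelled, not returned); whatever falls under "etc." would need to be added. The order fields and the two-day shipping threshold are hypothetical.

```python
def perfect_order_rate(orders):
    """Share of orders that were neither cancelled nor returned."""
    if not orders:
        return None
    perfect = sum(1 for o in orders if not o["cancelled"] and not o["returned"])
    return perfect / len(orders)


def timely_shipped_rate(orders, sla_days=2):
    """Share of shipped orders that left within `sla_days` (assumed threshold)."""
    shipped = [o for o in orders if o["days_to_ship"] is not None]
    if not shipped:
        return None
    return sum(1 for o in shipped if o["days_to_ship"] <= sla_days) / len(shipped)


# Hypothetical orders for one product.
orders = [
    {"cancelled": True, "returned": False, "days_to_ship": None},
    {"cancelled": False, "returned": True, "days_to_ship": 1},
    {"cancelled": False, "returned": False, "days_to_ship": 1},
    {"cancelled": False, "returned": False, "days_to_ship": 3},
]

por = perfect_order_rate(orders)
tsr = timely_shipped_rate(orders)
```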
Data preparation and feature engineering
Derive product features (e.g., title quality, image quality, etc.)
Measure content richness (e.g., attributes available, grouping, etc.)
Measure delivery performance and customer feedback
Results
Improved quality of products displayed
Increased conversion by 3–5% for some countries
Small conversion change in other countries (non-significant)
Key takeaways
Data science is (i) a team sport, (ii) partly R&D, (iii) iterative
How you use data to solve problems (methodology), data preparation, and feature engineering matter more than machine learning
Thank you
[email protected]