Predicting Click Through Rate for Job Listings Manish Gupta Yahoo! HotJobs Jan 22, 2009.

Predicting Click Through Rate for Job Listings

Manish Gupta

Yahoo! HotJobs

Jan 22, 2009

CTR and its applications

• CTR = ratio of clicks (to open the full description of an entity) to views of its reduced version

• Rank results
• Impacts publisher revenue in pay-for-performance models
• Bidding in ad exchanges
• Trends can help detect click fraud

CTR for new job listings

• Avg CTR = 2.29%
• The maximum likelihood estimate (MLE, clicks/views) would have high variance for a new listing with few views
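
To illustrate the variance point, here is a minimal sketch that treats each view as an independent Bernoulli trial with click probability p, so the MLE clicks/views has standard error sqrt(p(1-p)/views). The independence assumption and the function names are illustrative and not taken from the talk.

```python
import math


def mle_ctr(clicks: int, views: int) -> float:
    """Maximum likelihood estimate of CTR under a binomial model."""
    return clicks / views


def mle_std_error(p: float, views: int) -> float:
    """Standard error of the MLE when the true click probability is p."""
    return math.sqrt(p * (1.0 - p) / views)


AVG_CTR = 0.0229  # average CTR quoted on the slide

# The standard error shrinks only with the square root of the view count.
for views in (10, 100, 10_000):
    print(views, round(mle_std_error(AVG_CTR, views), 4))
```

Under these assumptions, at 100 views the standard error is about 1.5 percentage points, which is comparable to the 2.29% average itself.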

CTR for job listings

Related work

• Regelson and Fain – Estimate CTR using topic clusters (job categories)

• Richardson et al. – Describe features for predicting CTR for ads.

• Our baseline: predict the avg CTR (2.29%) for every test job

Refined Problem definition

• Ideal: Predict CTR(job j, position p, user cluster u, context c)
– Data sparsity
– Huge feature vector
• Predict CTR(job)
– Use CTR versus position curve
• Predict CTR(job, position)
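
One possible reading of the step from CTR(job) to CTR(job, position) is to scale a job's position-independent CTR by a factor read off the CTR-versus-position curve. The sketch below assumes a multiplicative model. The curve values and the names `POSITION_CTR`, `position_factor`, and `ctr_at_position` are hypothetical and not from the talk.

```python
from typing import Dict

# Hypothetical CTR-versus-position curve: average CTR observed at each rank.
POSITION_CTR: Dict[int, float] = {1: 0.050, 2: 0.035, 3: 0.025, 4: 0.020, 5: 0.015}


def position_factor(position: int, curve: Dict[int, float]) -> float:
    """Ratio of the CTR at this rank to the unweighted mean over all ranks."""
    overall = sum(curve.values()) / len(curve)
    return curve[position] / overall


def ctr_at_position(job_ctr: float, position: int,
                    curve: Dict[int, float] = POSITION_CTR) -> float:
    """Scale a position-independent job CTR to a specific rank."""
    return job_ctr * position_factor(position, curve)


for rank in sorted(POSITION_CTR):
    print(rank, round(ctr_at_position(0.0229, rank), 4))
```

The unweighted mean over positions is a simplification. Weighting each rank by its view count would likely be closer to how an overall CTR is actually measured.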

Data set

• Used HotJobs data from Aug 11, 2008 to Aug 31, 2008 to predict CTR of jobs on Sep 1, 2008

• 40K jobs from 7K+ companies
• 32K jobs as the train set and 8K as the test set
• Jobs have location, company name, category, creation date, posting date, optional position-wise click history, job source, title, snippet & job description.

Different models

• Weka: Linear Regression and SMOReg
• Treenet: Gradient Boosted Decision Trees
• Feature selection:
– Weka: wrapper with evaluator=linear regression and search=GreedyStepwise
– Treenet: Variable importance metrics
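
The wrapper idea can be sketched in a few lines: greedily add the feature whose inclusion most reduces the error of a linear regression, and stop when no candidate helps. This is not Weka's implementation. It is a pure-Python illustration that assumes forward search, a single holdout split instead of cross-validation, and a tiny ridge term for numerical safety. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows: Sequence[Sequence[float]], y: Sequence[float],
            cols: List[int]) -> List[float]:
    """Least-squares weights (intercept first) for the chosen columns."""
    x = [[1.0] + [r[c] for c in cols] for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    for d in range(k):
        xtx[d][d] += 1e-9  # tiny ridge term so collinear columns do not crash
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve(xtx, xty)


def mse(rows, y, cols: List[int], w: List[float]) -> float:
    """Mean squared error of the fitted model on the given rows."""
    preds = [w[0] + sum(wj * r[c] for wj, c in zip(w[1:], cols)) for r in rows]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def greedy_forward(train_x, train_y, val_x, val_y, min_gain: float = 1e-9) -> List[int]:
    """Add one feature at a time while the holdout error keeps improving."""
    selected: List[int] = []
    best = mse(val_x, val_y, [], fit_ols(train_x, train_y, []))
    remaining = set(range(len(train_x[0])))
    while remaining:
        scored = []
        for c in sorted(remaining):
            cols = selected + [c]
            scored.append((mse(val_x, val_y, cols, fit_ols(train_x, train_y, cols)), c))
        err, col = min(scored)
        if best - err <= min_gain:
            break
        selected.append(col)
        remaining.remove(col)
        best = err
    return selected


# Toy data: the target depends on columns 0 and 2; column 1 is irrelevant.
rows = [[float(i), float((7 * i) % 5), float((3 * i) % 4)] for i in range(20)]
target = [2.0 * r[0] + 3.0 * r[2] for r in rows]
selected = greedy_forward(rows[0::2], target[0::2], rows[1::2], target[1::2])
print(selected)
```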

• Features from Similar Jobs (60)
– CTR of jobs with same title/company/state/city+state/category and their cardinalities, posted in past one/two weeks or all jobs, based on the click history of past one/two/three weeks
• Features from Related Jobs (288)
– CTR_mn of related jobs with m = |A-B| and n = |B-A| and cardinalities (0 ≤ m,n ≤ 5), posted in past one/two weeks or all jobs, based on the click history of past one/two/three weeks
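
The slide does not define A and B. The sketch below assumes they are the sets of words in the titles of the target job and of another job, so that (m, n) counts the words unique to each side. It also assumes CTR_mn pools clicks and views over all jobs at that (m, n) distance. The function names and the sample jobs are hypothetical.

```python
from typing import Iterable, Set, Tuple


def title_words(title: str) -> Set[str]:
    """Lower-cased set of whitespace-separated words in a title."""
    return set(title.lower().split())


def relatedness(title_a: str, title_b: str) -> Tuple[int, int]:
    """Return (m, n) = (|A-B|, |B-A|) for the two titles' word sets."""
    a, b = title_words(title_a), title_words(title_b)
    return len(a - b), len(b - a)


def related_ctr(target_title: str, jobs: Iterable[Tuple[str, int, int]],
                m: int, n: int) -> Tuple[float, int]:
    """Pooled CTR and cardinality of jobs at distance (m, n) from the target.

    jobs holds (title, clicks, views) tuples.
    """
    matched = [(c, v) for t, c, v in jobs if relatedness(target_title, t) == (m, n)]
    clicks = sum(c for c, _ in matched)
    views = sum(v for _, v in matched)
    return (clicks / views if views else 0.0), len(matched)


jobs = [
    ("Python Developer", 3, 100),
    ("Ruby Developer", 1, 100),
    ("Senior Java Developer", 5, 100),
]
print(relatedness("Java Developer", "Python Developer"))
print(related_ctr("Java Developer", jobs, 1, 1))
```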

• Job Title Features (11)
– #words, #capitalized words, isAllCaps, hasHighPunct, hasLongWords, hasNumbers, vocabulary features
• Daily CTR Features for past 3 weeks (21)
• Other Features (10)
– Job Category, age, location specificity, job source, and job description page features
• Other potential features
– high-marketing-pitch words, brand value of company, spam feedback, seasonal variations
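
As an illustration of the job title features, the sketch below computes the six named surface features for a single title. The slide gives no thresholds, so the cutoffs for "long" words and "high" punctuation (`long_word_len`, `punct_ratio`) are assumptions. The vocabulary features are omitted because the slide does not describe them.

```python
import string
from typing import Dict, Union


def title_features(title: str, long_word_len: int = 12,
                   punct_ratio: float = 0.2) -> Dict[str, Union[int, bool]]:
    """Surface features of a job title; both thresholds are assumed values."""
    words = title.split()
    letters = [ch for ch in title if ch.isalpha()]
    punct = sum(ch in string.punctuation for ch in title)
    return {
        "num_words": len(words),
        "num_capitalized_words": sum(w[0].isupper() for w in words),
        "is_all_caps": bool(letters) and all(ch.isupper() for ch in letters),
        "has_high_punct": bool(title) and punct / len(title) > punct_ratio,
        "has_long_words": any(len(w) >= long_word_len for w in words),
        "has_numbers": any(ch.isdigit() for ch in title),
    }


print(title_features("Senior Software Engineer"))
print(title_features("URGENT!!! EARN $5000 NOW!!!"))
```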

Experiments and results

• Baseline: Predict avg CTR for a test job (2.29%)
• Predicting the category-wise avg CTR (A)
• Linear Regression over 390 features (B) – uses only 142 regressors.
• GBDT using Treenet over 390 features (C) – uses 300 regressors. (at
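
The two simplest predictors, the global average baseline and the category-wise average (A), can be sketched as follows. The record shape, the toy numbers, the fallback for unseen categories, and the choice of RMSE as the error metric are all assumptions made for illustration. The slides do not say which regression metrics were reported here.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Job = Tuple[str, float]  # (category, observed CTR); hypothetical record shape


def global_average(train: Sequence[Job]) -> float:
    """Baseline: the mean CTR over all training jobs."""
    return sum(ctr for _, ctr in train) / len(train)


def category_averages(train: Sequence[Job]) -> Dict[str, float]:
    """Model (A): the mean CTR per job category."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for category, ctr in train:
        grouped[category].append(ctr)
    return {cat: sum(vals) / len(vals) for cat, vals in grouped.items()}


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean squared error between predictions and observed CTRs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Toy data, not from the HotJobs data set.
train = [("sales", 0.01), ("sales", 0.03), ("tech", 0.04)]
test = [("sales", 0.02), ("tech", 0.05), ("legal", 0.03)]

baseline = global_average(train)
by_category = category_averages(train)
actual = [ctr for _, ctr in test]
baseline_preds = [baseline] * len(test)
# Categories unseen in training fall back to the global average.
category_preds = [by_category.get(cat, baseline) for cat, _ in test]
print(rmse(baseline_preds, actual), rmse(category_preds, actual))
```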


Analysis of regressor distribution

Important features

• Similar Jobs features
– Same company, title, city+state using 1 week click history
• Other features
– Creation date, job description page size, date of update, posting date, job category
• Related Jobs features
– Related_11, related_12 jobs posted in past 1/3 weeks over 1/3 week click history

Pruning the feature set

• Wrapper-based feature selection with linear regression and with Treenet’s variable importance (E) – 11 features.

In absence of click history …

• Linear regression with 369 features (F) – uses 187 regressors.

• Treenet uses 282 regressors at 256_600_0.01_20 (G)

Analysis of regressor distribution

None of the sets alone helps!

Pruning the feature set

Variable importance curves

Conclusion and future work

• More features
• Dyadic models to predict user-personalized CTR with (job feature vector, user feature vector) dyads.
• Auto model updates to correct model drift

• We built a machine learning system to predict CTR for job listings and presented our results using various regression metrics.

Thanks for your time