1
Recruiting Solutions
Expertise Search @ LinkedIn
Viet Ha-Thuc, Ganesh Venkataraman, Mario Rodriguez, Shakti Sinha, Senthil Sundaram and Lin Guo
Viet Ha-Thuc
2
• 200+ countries and territories
• 2+ new members per second
3
4
Talent Solutions
Help recruiters and companies search for the right talent with their desired expertise
5
Agenda
Introduction
Skill Reputation Scores
Personalized Learning-to-Rank
Results & Lessons
6
Introduction
Skills
– 40K+ standardized skills
– Members get endorsed on skills
– Represent professional expertise
7
Introduction
Expertise search on LinkedIn
– Queries contain skills and no personal name
8
Introduction
Unique challenges to LinkedIn expertise search
– Scale: 400M members x 40K standardized skills
– Sparsity of skills in profiles
– Personalization
…
9
Agenda
Introduction
Skill Reputation Scores
Personalized Learning-to-Rank
Results & Lessons
10
Reputation
Information a decision maker uses to make a judgment on an entity with a record (*)
(*) “Building web reputation systems”, Glass and Farmer, 2010
11
Skill Reputation Scores
Decision Maker: searcher
Record: Professional career
Skill reputation: member expertise on a skill
Judgment: Hire?
12
Estimating Skill Reputation
Inputs (figure labels): Endorse, profile, browsemap
Supervised learning algorithm estimates P(expert | member, skill)
Member x skill score matrix (rows: members, columns: skills, ? = unknown):
? .85 .45
? ? .35
? .42 ?
? ? .05
13
Estimating Skill Reputation
Inputs (figure labels): Endorse, profile, browsemap
Matrix Factorization
Member x skill score matrix (rows: members, columns: skills, ? = unknown):
? .85 .45
? ? .35
? .42 ?
? ? .05
Member factors (each row is a representation of a member in latent space):
0.5 1
0.7 0
0 0.6
0.1 0
Skill factors (each column represents a skill in latent space):
0.2 0.3 0.5
0.5 0.7 0.2
14
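Estimating Skill Reputation
Inputs (figure labels): Endorse, profile, browsemap
Member x skill score matrix (rows: members, columns: skills, ? = unknown):
? .85 .45
? ? .35
? .42 ?
.02 ? ?
Member factors (rows: members):
0.5 1
0.7 0
0 0.6
0.1 0
Skill factors (columns: skills):
0.2 0.3 0.5
0.5 0.7 0.2
Fill in unknown cells in the original matrix (product of the two factor matrices):
.6 .85 .45
.14 .21 .35
.3 .42 .12
.02 .03 .05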
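The fill-in step can be checked by hand. The following is a minimal sketch in plain Python, using only the illustrative numbers from the slide; the helper name `matmul` is an assumption made for this example.

```python
# Member factors M (4 members x 2 latent dimensions) and skill factors S
# (2 latent dimensions x 3 skills), copied from the slide.
M = [[0.5, 1.0], [0.7, 0.0], [0.0, 0.6], [0.1, 0.0]]
S = [[0.2, 0.3, 0.5], [0.5, 0.7, 0.2]]


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Every cell of the product is a predicted score, including the cells
# that were unknown ("?") in the original matrix.
filled = [[round(v, 2) for v in row] for row in matmul(M, S)]
for row in filled:
    print(row)
```

The printed rows match the filled-in matrix on the slide, and the known cells (.85, .45, .35, .42, .02) are reproduced unchanged.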
15
Matrix Factorization
Matrix factorization by Alternating Least Squares optimization
R (member x skill scores, ? = unknown):
? .85 .45
? ? .35
? .42 ?
.02 ? ?
M (member factors, held fixed):
0.5 1
0.7 0
0 0.6
0.1 0
S (skill factors): ? (solved in this step)
$S_{i+1} = \arg\min_{S} \lVert R - M_i S \rVert^2$
16
Matrix Factorization
Matrix factorization by Alternating Least Squares optimization
R (member x skill scores, ? = unknown):
? .85 .45
? ? .35
? .42 ?
.02 ? ?
M (member factors, marked ? and re-solved in this step):
0.5 1
0.7 0
0 0.6
0.1 0
S (skill factors, held fixed):
0.2 0.3 0.5
0.5 0.7 0.2
$M_{i+1} = \arg\min_{M} \lVert R - M S_{i+1} \rVert^2$
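The two alternating updates can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not the Apache Mahout implementation the deck refers to. The ridge term `lam`, the deterministic starting factors, and all function names are assumptions added for the example, and skill factors are stored one vector per skill (the transpose of the slide's S).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(a, b):
    """Solve the small dense system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def update(fixed, observed, k, lam):
    """Least-squares solve for each factor vector, other side held fixed.

    observed[x] lists (index into `fixed`, score) pairs for entity x.
    lam is an assumed ridge term that keeps each system well conditioned.
    """
    out = []
    for obs in observed:
        a = [[lam if p == q else 0.0 for q in range(k)] for p in range(k)]
        b = [0.0] * k
        for idx, r in obs:
            v = fixed[idx]
            for p in range(k):
                b[p] += r * v[p]
                for q in range(k):
                    a[p][q] += v[p] * v[q]
        out.append(solve(a, b))
    return out


def objective(R, M, S, lam):
    """Squared error on known cells plus the assumed ridge penalty."""
    err = sum((r - dot(M[i], S[j])) ** 2 for (i, j), r in R.items())
    reg = lam * sum(x * x for row in M + S for x in row)
    return err + reg


def als(R, M, S, k=2, lam=0.01, iters=20):
    by_member = [[(j, r) for (i2, j), r in R.items() if i2 == i]
                 for i in range(len(M))]
    by_skill = [[(i, r) for (i, j2), r in R.items() if j2 == j]
                for j in range(len(S))]
    for _ in range(iters):
        S = update(M, by_skill, k, lam)   # S step: M held fixed
        M = update(S, by_member, k, lam)  # M step: new S held fixed
    return M, S


# Known cells of R from the slide: (member, skill) -> score.
R = {(0, 1): 0.85, (0, 2): 0.45, (1, 2): 0.35, (2, 1): 0.42, (3, 0): 0.02}
M0 = [[0.1 * (i + p + 1) for p in range(2)] for i in range(4)]
S0 = [[0.1 * (j + p + 1) for p in range(2)] for j in range(3)]
M1, S1 = als(R, M0, S0)
print(objective(R, M0, S0, 0.01), objective(R, M1, S1, 0.01))
```

Each half-step exactly minimizes the penalized objective over one factor matrix, so the second printed value cannot exceed the first.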
17
Matrix Factorization
Matrix factorization by Alternating Least Squares optimization
– Apache Mahout
Take skill co-occurrence patterns to infer missing skills
– Members knowing “Big Data” are also likely to know “Hadoop”
18
Skill Reputation Feature
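Project a query into latent space: $Q = s_j + s_k$
Reputation $= m_i \cdot (s_j + s_k) = m_i \cdot s_j + m_i \cdot s_k$
Efficiency: pre-compute and index member-skill scores $m_i \cdot s_j$
[Figure: member matrix M with row $m_i$; skill matrix S with columns $s_j$ and $s_k$]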
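Because the dot product is linear, the query-time score reduces to a sum of pre-computed member-skill scores. The sketch below illustrates this with the toy factors from the matrix factorization slides. The names `INDEX`, `reputation`, and `reputation_direct` are hypothetical, and a production index would of course not be an in-memory dictionary.

```python
# Toy latent vectors: one per member and one per skill (illustrative only).
M = [[0.5, 1.0], [0.7, 0.0], [0.0, 0.6], [0.1, 0.0]]
SKILLS = [[0.2, 0.5], [0.3, 0.7], [0.5, 0.2]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Offline: pre-compute and index every member-skill score m_i . s_j.
INDEX = {(i, j): dot(m, s)
         for i, m in enumerate(M)
         for j, s in enumerate(SKILLS)}


def reputation(member, query_skills):
    """Online: sum indexed scores for the skills named in the query."""
    return sum(INDEX[(member, j)] for j in query_skills)


def reputation_direct(member, query_skills):
    """Reference: project the query into latent space, then take m_i . Q."""
    dims = len(SKILLS[0])
    q = [sum(SKILLS[j][d] for j in query_skills) for d in range(dims)]
    return dot(M[member], q)


print(reputation(0, [1, 2]), reputation_direct(0, [1, 2]))
```

Both functions give the same score up to floating-point rounding, but only the first avoids any vector arithmetic at query time.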
19
Features
Reputation feature
Social Connection
Homophily
– Geo
– Industry
Textual Features
20
Agenda
Introduction
Skill Reputation Scores
Personalized Learning-to-Rank
Results & Lessons
Ranking
▪ Manual tuning vs. Learning to Rank (LTR)
▪ Why Learning to Rank?
– Hard to manually tune with a very large number of features
– Challenging to personalize
– LTR allows leveraging a large volume of click data in an automated way
21
22
Training Data: click logs
Top-K randomization
Labels:
– Perfect: label = 3 (InMail)
– Good: label = 1 (click)
– Bad: label = 0
– Uncertain (removed)
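A minimal sketch of turning one logged result page into graded labels follows. The slide gives only the label values, so the rule used here to separate "bad" from "uncertain" (an unclicked result counts as bad only if it sits above the last result that received an action) is an assumption, as are the function name and record fields. Top-K randomization is not modeled.

```python
def label_impressions(impressions):
    """Label one result page, given in ranked order.

    Each impression is a dict with boolean 'clicked' and 'inmail' fields
    (hypothetical field names). Returns (position, label) pairs and drops
    results whose label is uncertain.
    """
    acted = [pos for pos, imp in enumerate(impressions)
             if imp["clicked"] or imp["inmail"]]
    last_action = max(acted, default=-1)
    labeled = []
    for pos, imp in enumerate(impressions):
        if imp["inmail"]:
            labeled.append((pos, 3))   # perfect
        elif imp["clicked"]:
            labeled.append((pos, 1))   # good
        elif pos < last_action:
            labeled.append((pos, 0))   # assumed bad: skipped above an action
        # otherwise uncertain: removed from the training data
    return labeled


page = [
    {"clicked": False, "inmail": False},
    {"clicked": True, "inmail": False},
    {"clicked": False, "inmail": False},
    {"clicked": True, "inmail": True},
    {"clicked": False, "inmail": False},
]
print(label_impressions(page))
```

In this example the last result receives no label, because nothing below it was acted on and the searcher may never have looked at it.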
23
Learning to Rank
Coordinate Ascent (listwise)
– Relevance is considered relative to each query
– Allows optimizing the quality metric directly
Objective function
– Normalized Discounted Cumulative Gain (NDCG@K)
– Graded relevance labels
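To make the objective concrete, the sketch below computes NDCG@K from graded labels and runs a deliberately simplified coordinate ascent over the weights of a linear scoring function. The gain formula 2^label - 1, the fixed step sizes, the starting weights, and the toy data are all assumptions; a real coordinate-ascent ranker uses a more careful line search and random restarts.

```python
import math


def dcg_at_k(labels, k):
    """Discounted cumulative gain with the common 2^label - 1 gain."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(labels[:k]))


def ndcg_at_k(labels, k):
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0


def mean_ndcg(weights, queries, k):
    """Rank each query's documents by a linear score, then average NDCG@k."""
    total = 0.0
    for docs in queries:
        ranked = sorted(
            docs, key=lambda d: -sum(w * x for w, x in zip(weights, d[0])))
        total += ndcg_at_k([label for _, label in ranked], k)
    return total / len(queries)


def coordinate_ascent(queries, n_features, k=10,
                      steps=(-1.0, -0.5, 0.5, 1.0), sweeps=5):
    """Change one weight at a time, keeping any change that raises NDCG."""
    weights = [1.0] * n_features
    best = mean_ndcg(weights, queries, k)
    for _ in range(sweeps):
        for f in range(n_features):
            for step in steps:
                trial = list(weights)
                trial[f] += step
                score = mean_ndcg(trial, queries, k)
                if score > best:
                    weights, best = trial, score
    return weights, best


# One toy query: (feature vector, graded label) per document.
queries = [[([0.9, 0.1], 3), ([0.5, 0.8], 0), ([0.2, 0.2], 1)]]
start = mean_ndcg([1.0, 1.0], queries, 3)
weights, best = coordinate_ascent(queries, 2, k=3)
print(start, best, weights)
```

The metric is evaluated on whole ranked lists, one list per query, which is what makes the approach listwise and lets it optimize NDCG directly rather than a surrogate loss.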
24
Agenda
Introduction
Skill Reputation Scores
Personalized Learning-to-Rank
Results & Lessons
25
Experiments
Query Tagging
Target Segment: skill and no-name
Baseline
– No skill reputation feature
– Hand-tuned
Search Products: Flagship and Premium
A/B Tests for 4 weeks
– Novelty effect: ignore 1st week
– Size: hundreds of thousands of searches
26
Results
Improvements over the baseline
Product     CTR@10    # Messages per Search
Flagship    +11%      +20%
Premium     +18%      +37%
27
Take-Aways
Going beyond text features
– Exploit structured data
Matrix factorization for large-scale reputation estimation
– 400M members x 40K skills
Personalized Learning-to-Rank is crucial
Full Paper: http://arxiv.org/pdf/1602.04572v1.pdf
28
We are hiring!
email: [email protected]