The dark art of search relevancy

Transcript of The dark art of search relevancy

  1. 1. The dark art of search relevancy
  2. 2. Hi, I'm Eddie. @ejlbell
  3. 3. We collect the world of fashion into a customisable shopping experience.
  4. 4. 3,800,000 items scraped per day. 90 thousand items updated / hour.
  5. 5.
  6. 6. Why is search hard?
  7. 7. Precision and recall. Relevant elements: true positives + false negatives. Selected elements: true positives + false positives. True negatives are neither relevant nor selected. Precision: how many selected items are relevant? Recall: how many relevant items are selected?
  8. 8. BM25
  9. 9. When it goes wrong: Little Black Dress
  10. 10. When it goes wrong: Red Valentino
  11. 11. BCBGMAXZRIA
  12. 12. Dress
  13. 13. Dress; Dress Shirt
  14. 14. Dress; Dress Shirt; Shirt Dress
  15. 15. What is a good result?
  16. 16. Clickthrough Data
  17. 17. Crowdsource Relevance
  18. 18. Ordinal: Light Pink Heels
  19. 19. Pairwise: Blue Trainers
  20. 20. Pairwise: Blue Trainers
  21. 21. What are people actually searching?
      Search Term                              Example
      1.) designer + category                  hermes sandal
      2.) designer + colour + category         burberry black boots
      3.) designer + fabric + category         chloe leather top
      4.) colour + category                    gray hoodie
      5.) designer + type                      asos bag
  22. 22. DSSM
  23. 23. C-DSSM: DSSM with 1D convolution and max pooling
  24. 24. C-DSSM Results
      Microsoft Office
        Search Result                    Score
        Download office Excel            0.54
        Word office online               0.50
        Apartment office hours           0.33
        International office berkeley    0.27
      Car body shop
        Search Result                    Score
        Car body kits                    0.70
        Auto body parts                  0.55
        Calculate body fat               0.22
        Forcefield body armour           0.17
  25. 25. Computing the results
      Store hidden layer representation in Postgres
      Rank by cosine distance
      Speed up search with ANN (approximate nearest neighbours)
      Random projection trees
      Calculate query representation at run time
      Docker, Django-rest, chef, empire, auto-scaling
  26. 26. Images?
  27. 27. Images
      Train 8-layer CNN
      Swap out softmax layer
      A softmax layer for each label of interest
      60 million parameters
      Build robust learned representations
      Represents products as a 4096-element vector
  28. 28. Image Classifier
      Sub-category                               Score
      Male, clothing, suits, 3 piece suits       0.72263
      Male, clothing, suits, 2 piece suits       0.12102
      Male, clothing, jackets, formal jackets    0.10818
      Male, clothing, coats, trench coats        0.02396
      Male, clothing, jackets, casual jackets    0.00949
      Colors
      Blue                                       0.99397
      Gray                                       0.00481
      Black                                      0.00071
  29. 29. Conclusion: Search is fun but search is hard
  30. 30. thank you
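
Slide 7 defines precision and recall in terms of selected and relevant items. A minimal Python sketch of those two ratios follows. The function name and the item ids are hypothetical and are not taken from the talk.

```python
def precision_recall(selected, relevant):
    """Return (precision, recall) for one query.

    selected: item ids the search engine returned.
    relevant: item ids a judge marked as relevant.
    """
    selected, relevant = set(selected), set(relevant)
    true_positives = len(selected & relevant)
    # Precision: share of the returned items that are relevant.
    precision = true_positives / len(selected) if selected else 0.0
    # Recall: share of the relevant items that were returned.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical item ids, for illustration only.
selected = ["dress-1", "dress-2", "shirt-7", "dress-9"]
relevant = ["dress-1", "dress-2", "dress-3"]
precision, recall = precision_recall(selected, relevant)
```

Here two of the four returned items are relevant, and two of the three relevant items were returned, so precision is 2/4 and recall is 2/3.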
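
Slide 8 names BM25 without showing the formula. The sketch below is one common textbook form of Okapi BM25, not necessarily the exact variant or parameters used in the talk. The defaults `k1=1.2` and `b=0.75`, the whitespace tokeniser, the non-negative IDF variant, and the tiny catalogue are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in docs against query with Okapi BM25."""
    if not docs:
        return []
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    avg_len = sum(len(d) for d in tokenised) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for d in tokenised for term in set(d))
    scores = []
    for d in tokenised:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            n = doc_freq[term]
            # Non-negative IDF variant (assumed here).
            idf = math.log((n_docs - n + 0.5) / (n + 0.5) + 1.0)
            # Term frequency saturates via k1; b controls length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical catalogue titles, echoing the examples on the slides.
catalogue = ["shirt dress", "dress shirt", "little black dress", "blue trainers"]
scores = bm25_scores("dress shirt", catalogue)
```

Because this scoring treats a title as a bag of words, "shirt dress" and "dress shirt" receive identical scores for the query "dress shirt". That may be the kind of failure slides 12 to 14 illustrate, though the transcript does not say so explicitly.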
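
Slide 25 describes ranking stored hidden-layer representations by cosine distance. A minimal sketch of that ranking step is below, as an exhaustive scan in plain Python. The three-element vectors and item names are made up for illustration; the talk stores its vectors in Postgres and uses an approximate nearest-neighbour index instead of a full scan. Sorting by descending cosine similarity gives the same order as sorting by ascending cosine distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_vec, item_vecs, top_k=3):
    """Return the top_k (item_id, similarity) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vec, vec), item_id)
        for item_id, vec in item_vecs.items()
    ]
    scored.sort(reverse=True)
    return [(item_id, score) for score, item_id in scored[:top_k]]


# Hypothetical low-dimensional stand-ins for the learned representations.
item_vecs = {
    "blue trainers": [0.9, 0.1, 0.0],
    "light pink heels": [0.1, 0.9, 0.2],
    "gray hoodie": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # Would be computed from the query at run time.
ranked = rank_by_cosine(query_vec, item_vecs)
```

The full scan costs one similarity computation per stored item for every query, which is the cost an approximate nearest-neighbour structure such as random projection trees is meant to reduce.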