Spcua 2013 Alexey Kozhemiakin Enterprise Search
![Page 1: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/1.jpg)
May 22nd 2013, Kiev
Enterprise search portals SharePoint 2013
Alexey Kozhemiakin
![Page 2: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/2.jpg)
or “How to make a cool search”
![Page 3: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/3.jpg)
3
Who’s speaking to you?
• Solution Architect @epam
• Focusing on search
  • SharePoint Search FAST/2010/2013
  • Apache Lucene, Solr, elasticsearch, Oracle Endeca…
• http://powersearching.wordpress.com
![Page 4: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/4.jpg)
4
Agenda
• Enterprise Search Portal
• Insight into SP2013 Search
• Key changes from SP2010
• A bit of magic: relevancy calculation
• Search governance, useful hints & tips
![Page 5: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/5.jpg)
5
Key search patterns
• I know what I’m searching for and where to find it
• I know what I’m searching for but don’t know where to find it
• I don’t know what I’m searching for
http://aghy.hu/AghyBlog_EN/Lists/Posts/Post.aspx?ID=199
![Page 6: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/6.jpg)
6
Enterprise Search Portal
• Demand:
  • Fast-growing enterprises
  • Zoo of internal systems
• Solution:
  • “Google” inside the enterprise
• Quick wins for business:
  • Single point of smart search and information retrieval
  • Less time spent searching by employees
  • Better internal communication and simplified reuse of content
![Page 7: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/7.jpg)
7
But after deployment…
• «… Search sucks»
• Out-of-the-box search knows nothing about you
• Typical “but…” responses:
  • «… Microsoft takes care of a decent search algorithm»
  • «… we’re not sure we can do better»
  • «… we don’t need search, everybody knows where the content is»
  • «… make our search like Facebook/Google/Bing» (instead of requirements)
![Page 8: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/8.jpg)
8
Why it’s hard
• Ambiguous short queries
• Unstructured, unoptimized content
• Different active vocabularies of content users and creators
• Limited resources ($), while in internet search:
  • Automated and manual testing of search quality (assessors)
  • Continuous improvement
![Page 9: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/9.jpg)
9
Search architecture in SP2013
![Page 10: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/10.jpg)
10
Search is a two-phase process
• Matching: all docs with keywords
  • Linguistics: stemming, phonetics
  • Synonyms
• Ranking
  • «Features»
    • TF-IDF, BM25
    • Field weights
    • File type
    • Modification date
    • Popularity
    • …
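The two phases above can be sketched in a few lines. This is a minimal illustration that assumes pre-tokenized documents and the textbook Okapi BM25 formula with common default parameters (`k1=1.2`, `b=0.75`); it is not SharePoint's implementation, and the sample documents are made up.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score every tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents get less credit per hit.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical mini-corpus, already tokenized.
docs = [
    "sharepoint search ranking model".split(),
    "fast search server for sharepoint search".split(),
    "cafeteria menu for monday".split(),
]
query = ["search", "ranking"]

# Phase 1 (matching): keep documents containing at least one query term.
matched = [i for i, d in enumerate(docs) if set(query) & set(d)]

# Phase 2 (ranking): order the matched documents by score.
scores = bm25_scores(query, docs)
ranked = sorted(matched, key=lambda i: scores[i], reverse=True)
```

Only documents that survive matching are ranked, which is why linguistics and synonyms (phase 1) affect recall while features (phase 2) affect order.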
![Page 11: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/11.jpg)
11
Ranking in FAST
• Linear combination of features
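A linear combination of features can be sketched as a weighted sum. The feature names and weights below are hypothetical and chosen only for illustration; they are not FAST's actual rank profile values.

```python
# Hypothetical weights for a linear rank model (illustrative only).
WEIGHTS = {
    "term_match": 3.0,
    "freshness": 1.0,
    "static_rank": 2.0,
    "proximity": 1.5,
}

def contributions(features, weights=WEIGHTS):
    """Per-feature share of the final rank: weight * feature value."""
    return {name: weights[name] * value for name, value in features.items()}

def linear_rank(features, weights=WEIGHTS):
    """Final rank is the sum of all weighted feature values."""
    return sum(contributions(features, weights).values())

doc_a = {"term_match": 0.8, "freshness": 0.2, "static_rank": 0.5, "proximity": 0.4}
doc_b = {"term_match": 0.6, "freshness": 0.9, "static_rank": 0.1, "proximity": 0.1}

rank_a = linear_rank(doc_a)
rank_b = linear_rank(doc_b)
```

Because the model is linear, `contributions` shows directly how much each component adds to the final rank of a document.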
![Page 12: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/12.jpg)
12
Ranking in FAST
• Impact of each component to final rank
[Chart: contributions of term:fast, term:search, freshness, static rank, and proximity for the 1st to 4th results; vertical axis from 0 to 8000]
![Page 13: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/13.jpg)
13
Migration FAST->SP2013
![Page 14: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/14.jpg)
14
Ranking in SP2013
![Page 15: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/15.jpg)
15
Ranking in SP2013
• Default Relevancy Model
• Two neural networks
• Freshness is not included in ranking
• Features:

Type              Instance
BM25              BM25
Static            UrlDepth
BucketedStatic    InternalFileType
BucketedStatic    Language
Static            ClickDistance
Static            QueryLogClicks
Static            QueryLogSkips
Static            LastClicks
Static            EventRate
MinSpan - soft    Title
MinSpan - soft    Title
MinSpan - soft    Title
MinSpan - soft    Content
![Page 16: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/16.jpg)
16
Ranking in SP2013
• Default relevancy model
![Page 17: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/17.jpg)
17
Explain rank
• /_layouts/15/explainrank.aspx
• rankdetail property
![Page 18: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/18.jpg)
18
Explain rank
• Manual validation in Excel
![Page 19: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/19.jpg)
19
![Page 20: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/20.jpg)
20
Search Governance
1. Search analytics
2. Fine tuning and adaptation
3. Regular testing
4. Security assessment
5. Promotion within company
6. Content optimization and basic SEO
![Page 21: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/21.jpg)
21
1. Search analytics
• Search analytics
• Search analytics
• Search analytics
• Obey! Use Search analytics
![Page 22: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/22.jpg)
22
1. Search analytics
• OOTB in SP2013
  • Most popular queries
  • «No results/abandoned» queries
• 3rd-party tools (Google Analytics, Omniture, WebTrends)
• Measure search quality (!)
  • % of clicks on results
  • Which results
  • Return after clicks
• Session analysis
• Query segmentation
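The search quality measures above can be computed from a query log. The sketch below assumes a hypothetical log format (one record per query with a result count and the rank of the clicked result, if any); real analytics exports will look different.

```python
def search_quality(log):
    """Summarize a query log into a few search quality indicators."""
    total = len(log)
    no_results = [e for e in log if e["results"] == 0]
    clicked = [e for e in log if e["clicked_rank"] is not None]
    # Abandoned: results were shown but the user clicked nothing.
    abandoned = total - len(no_results) - len(clicked)
    mean_rank = (
        sum(e["clicked_rank"] for e in clicked) / len(clicked) if clicked else None
    )
    return {
        "click_through_rate": len(clicked) / total,
        "no_results_rate": len(no_results) / total,
        "abandoned_rate": abandoned / total,
        "mean_clicked_rank": mean_rank,
    }

# Hypothetical log records (field names are assumptions).
log = [
    {"query": "vacation policy", "results": 12, "clicked_rank": 1},
    {"query": "vpn", "results": 40, "clicked_rank": 3},
    {"query": "expence report", "results": 0, "clicked_rank": None},
    {"query": "project plan", "results": 250, "clicked_rank": None},
]
metrics = search_quality(log)
```

Tracking these numbers over time shows whether tuning changes actually help, for example a falling mean clicked rank means users find what they need closer to the top.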
![Page 23: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/23.jpg)
23
Query segmentation
• Analyze and improve not only top N queries, but classes of queries
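One simple way to work with classes of queries is to bucket the log by marker words and count each bucket. The segment names and markers below are hypothetical; a real portal would derive them from its own analytics.

```python
from collections import Counter

# Hypothetical segments: (name, marker substrings). First match wins.
SEGMENTS = [
    ("how-to", ("how to", "install", "setup")),
    ("policy", ("policy", "vacation", "expense")),
    ("software", ("download", "software")),
]

def segment(query):
    """Assign a query to the first segment whose marker it contains."""
    q = query.lower()
    for name, markers in SEGMENTS:
        if any(marker in q for marker in markers):
            return name
    return "other"

queries = [
    "how to install visio",
    "vacation policy",
    "download office",
    "john smith",
    "How to book a room",
]
counts = Counter(segment(q) for q in queries)
```

Quality metrics can then be reported per segment, so an improvement for one class of queries is not hidden by the rest.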
![Page 24: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/24.jpg)
24
2. Fine tuning
• Authoritative Pages
  • Quick win: content source priority
• Query Rules
  • Smart search for users
• Synonyms
  • Separate mapping file
  • Expansion only
  • Term set synonyms do NOT work
• Relevancy models
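"Expansion only" means the original term stays in the query and its synonyms are added as alternatives. The sketch below illustrates the idea with a small mapping file; the `Key,Synonym,Language` column layout and the sample entries are assumptions for illustration, and the expansion is done here in plain Python rather than by the search engine.

```python
import csv
import io

# Assumed mapping-file layout; entries are made up.
THESAURUS_CSV = """Key,Synonym,Language
IE,Internet Explorer,en
HR,Human Resources,en
"""

def load_thesaurus(text):
    """Read the mapping file into {lowercased key: [synonyms]}."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table.setdefault(row["Key"].lower(), []).append(row["Synonym"])
    return table

def expand(query, thesaurus):
    """Keep each original term and OR it with its synonyms (expansion only)."""
    parts = []
    for term in query.split():
        synonyms = thesaurus.get(term.lower(), [])
        if synonyms:
            alternatives = [term] + ['"%s"' % s for s in synonyms]
            parts.append("(" + " OR ".join(alternatives) + ")")
        else:
            parts.append(term)
    return " ".join(parts)

thesaurus = load_thesaurus(THESAURUS_CSV)
expanded = expand("IE settings", thesaurus)
```

Note that the mapping is one-way: a query for "Internet Explorer" is not expanded to "IE" unless a second entry says so.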
![Page 25: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/25.jpg)
25
Authoritative Pages
• Impacts ClickDistance
• ClickDistance and UrlDepth have a high impact on the total score (see explain rank)
• Configured in CA or via CSOM
![Page 26: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/26.jpg)
26
Query Rules (Rule + Action)
• The tool to make search smarter
• Interactive feedback to user queries
• Post processing of queries
• Leverage navigational queries
• …
![Page 27: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/27.jpg)
27
Condition for Query Rules
• Query Matches Keyword Exactly
• Advanced Query Text Match
• Query Matches Dictionary Exactly
• Query Contains Action Term
• Query More Common in Source
• Result Type Commonly Clicked
![Page 28: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/28.jpg)
28
Actions for Query Rules
• Create and display a result block
• Change ranked search results
• Best Bets
• XRANK
  • Adds to the total rank
  • Not explained in rankdetail
  • How to choose the correct value?
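Because the boost is added to the total rank, its value only matters relative to the gaps between existing scores. The sketch below models a constant boost (in KQL this corresponds to something like `XRANK(cb=...)`); the scores and file names are invented, and the function is a plain-Python illustration rather than the engine's behavior.

```python
def apply_xrank(results, matches, cb):
    """Add a constant boost cb to every result that satisfies the condition."""
    boosted = [
        (doc, score + cb if matches(doc) else score) for doc, score in results
    ]
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)

def is_presentation(doc):
    return doc.endswith(".pptx")

# Hypothetical ranked results: (document, rank score).
results = [("a.docx", 12.0), ("b.pptx", 10.5), ("c.docx", 9.0)]

# A boost smaller than the gap to the top result changes nothing.
small = [doc for doc, _ in apply_xrank(results, is_presentation, cb=1.0)]

# A boost larger than the gap moves the presentation to the top.
large = [doc for doc, _ in apply_xrank(results, is_presentation, cb=2.0)]
```

One practical way to pick a value is to look at typical score gaps on the explain rank page for representative queries and size the boost accordingly.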
![Page 29: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/29.jpg)
29
Templates for QueryRules
• Typical navigational keywords from our portal
  • Software, soft, download, install
  • How to
  • Policy, Blog
  • Portal
  • Music, Video
  • Presentation, Documents, Report
  • Training, tutorial
  • Book, ebook
• You will have different ones!
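Navigational keywords like these are natural candidates for the "Query Contains Action Term" condition. The sketch below assumes the action term is matched at the beginning or end of the query and that the rest of the query is the subject; the term list is a small hypothetical subset.

```python
# Hypothetical action terms taken from a portal's navigational keywords.
ACTION_TERMS = {"download", "install", "policy", "video", "training", "presentation"}

def match_action_term(query, action_terms=ACTION_TERMS):
    """Return (action term, remaining subject terms) or None if no match."""
    words = query.lower().split()
    if len(words) < 2:
        return None
    if words[0] in action_terms:
        return words[0], " ".join(words[1:])
    if words[-1] in action_terms:
        return words[-1], " ".join(words[:-1])
    return None

leading = match_action_term("download visio")
trailing = match_action_term("Vacation policy")
no_match = match_action_term("john smith")
```

The matched action term can select a result block (for example, a software catalog for "download"), while the subject terms are passed on as the actual query.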
![Page 30: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/30.jpg)
30
Custom Rank Models
• Collect query judgments
• Tune neural network coefficients using machine learning
  • Gradient Descent, LambdaRank
• Microsoft.Office.Server.Search.RankerTuning
![Page 31: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/31.jpg)
31
Custom Rank Models
• Manually modify a new or very simple model (not the default one!)
• A/B testing of weights
• Measure, measure: Precision, NDCG
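Both metrics are easy to compute once each result has a relevance judgment. The sketch below uses graded judgments (0 = irrelevant, 3 = perfect) and the linear-gain form of DCG; other gain functions exist, and the judgment lists are invented for the A/B comparison.

```python
import math

def precision_at_k(relevances, k):
    """Share of the top k results judged relevant (grade > 0)."""
    return sum(1 for r in relevances[:k] if r > 0) / k

def dcg_at_k(relevances, k):
    """Discounted cumulative gain: lower positions count less."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal ordering of the same results."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Judgments for the same query under two hypothetical rank models,
# listed in the order each model returned the documents.
model_a = [3, 2, 0, 1]
model_b = [0, 1, 3, 2]

ndcg_a = ndcg_at_k(model_a, 4)
ndcg_b = ndcg_at_k(model_b, 4)
```

Here both models have the same precision at 3, but NDCG separates them because model A puts the best documents first.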
![Page 32: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/32.jpg)
32
Custom Rank Models
• Example of simple model – people search
![Page 33: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/33.jpg)
33
3. Search quality testing
• Why is it needed? It’s your compass.
• «Unit testing»
• Periodic manual testing
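"Unit testing" for search can be as simple as a set of golden queries with a page that must appear near the top. In the sketch below, `fake_search` is a stand-in for a call to the real search API, and all queries and URLs are hypothetical.

```python
def run_golden_queries(search, golden, k=3):
    """Return the golden queries whose expected URL is missing from the top k."""
    failures = []
    for query, expected_url in golden.items():
        top = search(query)[:k]
        if expected_url not in top:
            failures.append((query, expected_url, top))
    return failures

# Stand-in for the real search call: query -> ordered list of result URLs.
FAKE_INDEX = {
    "vacation policy": ["/hr/vacation.aspx", "/hr/benefits.aspx"],
    "expense report": ["/fin/expenses.aspx"],
}

def fake_search(query):
    return FAKE_INDEX.get(query, [])

golden = {
    "vacation policy": "/hr/vacation.aspx",
    "expense report": "/fin/expenses.aspx",
    "vpn setup": "/it/vpn.aspx",
}
failures = run_golden_queries(fake_search, golden)
```

Running such a check after every change to rank models, query rules, or synonyms shows immediately whether a fix for one query broke another.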
![Page 34: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/34.jpg)
34
4. Security «audit»
• Search reveals breaches in security
  • Security by obscurity
• Examples of queries:
  • «confidential»
  • Salaries, performance reviews
• Solution – automatic monitoring of sensitive queries
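Automatic monitoring can start as a periodic scan of the query log for sensitive terms that returned results. The term list and log fields below are assumptions for illustration only.

```python
# Hypothetical watch list; extend it with terms relevant to your company.
SENSITIVE_TERMS = ("confidential", "salary", "salaries", "performance review")

def flag_sensitive(log, terms=SENSITIVE_TERMS):
    """Return log entries where a sensitive query actually returned results."""
    return [
        entry
        for entry in log
        if entry["results"] > 0
        and any(term in entry["query"].lower() for term in terms)
    ]

# Hypothetical query log records.
log = [
    {"user": "u1", "query": "Salaries 2013", "results": 4},
    {"user": "u2", "query": "confidential", "results": 0},
    {"user": "u3", "query": "lunch menu", "results": 15},
]
flagged = flag_sensitive(log)
```

A flagged entry does not prove a leak, but it tells the search team which result sets to review for wrongly shared content.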
![Page 35: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/35.jpg)
35
5. Adoption of content
• Work with departments
  • Get help with search by monitoring their queries
• Guidelines for formatting content
  • Basic SEO
  • Titles
  • Friendly URLs
  • Custom meta tags <meta name=…
    • Title, description
    • Custom tags automatically appear in crawled properties
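To check that content follows the guideline, the meta tags of a page can be extracted with a few lines of code. The sketch below uses Python's standard `html.parser`; the sample page and the `department` tag are hypothetical.

```python
from html.parser import HTMLParser

class MetaTagParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content

# Hypothetical page with a standard and a custom meta tag.
page = (
    "<html><head><title>Vacation policy</title>"
    '<meta name="description" content="How to request vacation">'
    '<meta name="department" content="HR">'
    "</head><body></body></html>"
)
parser = MetaTagParser()
parser.feed(page)
meta = parser.meta
```

Pages that come back without a description or with missing custom tags are the ones to hand back to the owning department.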
![Page 36: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/36.jpg)
36
6. Promotion within company
• Image: «you will find everything here»
• Integrate with other portals
  • Offer Search as a service
  • «Global search» widget
• Badges, gamification
![Page 37: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/37.jpg)
37
Promotion
• Social Best-bets
![Page 38: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/38.jpg)
38
Semantic search
• Cannot be solved in general
• Analytics + fine tuning
  • See practices above
• NLP: question answering
  • Rocket science
  • English only
  • Part-of-speech tagging, dependency parsing
  • Stanford NLP, OpenNLP, IR
![Page 39: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/39.jpg)
39
«References»
• Patents - http://goo.gl/20sbR
• Explain Rank page - http://goo.gl/o3ZmN
• How SP2013 relevancy models works - http://goo.gl/arf0P
• MS Enterprise Search approach - http://goo.gl/x8SDO
• Customizing ranking models in SP 2013 - http://goo.gl/lBJAp
![Page 40: Spcua 2013 Alexey Kozhemiakin Enterprise Search](https://reader034.fdocuments.us/reader034/viewer/2022051819/54c825a24a79596e068b458f/html5/thumbnails/40.jpg)
Thanks
Skype: Alexey_Kozhemiakin
Email: [email protected]
http://powersearching.wordpress.com
40