Semantic Search
Andisheh Keikha, Ryerson University
Ebrahim Bagheri, Ryerson University
May 7th 2014
Outline
• Search Process
• Query Processing
• Document Ranking
• Search Result Clustering and Diversification
• What is the Goal
• Contributions
Search Process
Simple search
• Query: keywords
• Find documents which have those keywords
• Rank them based on the query
• Result: ranked documents
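The simple search steps above can be sketched in a few lines. This is a minimal illustration, not the presenters' system: the whitespace tokenizer, the summed term-frequency score, the "at least one keyword" matching rule, and the three sample documents are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def build_index(docs):
    """Map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index

def search(query, index):
    """Return ids of documents containing at least one keyword, best first."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf  # assumed score: summed term frequency
    # Sort by descending score, then by id so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked]

# Hypothetical sample collection.
docs = {
    "d1": "gain weight fast with a high calorie diet",
    "d2": "weight training builds muscle",
    "d3": "semantic search with ontologies",
}
index = build_index(docs)
results = search("gain weight", index)
```

Here `d1` matches both keywords and `d2` only one, so `d1` ranks first and `d3` is not returned at all.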
Query Processing
Query length
• Correlated with performance in the search task
• A query is a small collection of keywords
• Hard to find relevant documents based on only two or three words
Solution
• Query reformulation
• Query expansion
Query Processing
Query Expansion
• Selection of new terms
  • Relevant documents
  • WordNet (synonym, hyponym, …)
  • …
• Disambiguation
Query Processing
Query Expansion
• Selection of new terms
• Weighting those terms
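The two steps, selecting new terms and weighting them, can be sketched as follows. The `RELATED_TERMS` table is a hypothetical stand-in for a real source such as WordNet or relevant documents, and the fixed weight of 0.5 for added terms is an assumption chosen only for illustration.

```python
# Hypothetical related-term table standing in for WordNet or relevant documents.
RELATED_TERMS = {
    "weight": ["mass", "fat", "muscle"],
}

def expand_query(query, related_terms, expansion_weight=0.5):
    """Return {term: weight}; original terms get 1.0, added terms a lower weight."""
    original = query.lower().split()
    weights = {term: 1.0 for term in original}
    for term in original:
        for new_term in related_terms.get(term, []):
            # setdefault never down-weights a term the user actually typed.
            weights.setdefault(new_term, expansion_weight)
    return weights

expanded = expand_query("gain weight", RELATED_TERMS)
```

A real system would replace the constant weight with a score derived from the term source, for example co-occurrence in relevant documents.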
Document Ranking
Probabilistic Methods
• What is the probability that this document is relevant to this query?
$$P(L \mid D) = \frac{P(D \mid L)\,P(L)}{P(D)}$$
• L: the event that the document is judged as relevant to the query
• D: the document description
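For ranking purposes, P(D) does not have to be estimated on its own. A short derivation, assuming the standard odds form of the probabilistic model, where $\bar{L}$ (a symbol not on the slide) denotes the event that the document is judged non-relevant:

```latex
\frac{P(L \mid D)}{P(\bar{L} \mid D)}
  = \frac{P(D \mid L)\,P(L) / P(D)}{P(D \mid \bar{L})\,P(\bar{L}) / P(D)}
  = \frac{P(D \mid L)}{P(D \mid \bar{L})} \cdot \frac{P(L)}{P(\bar{L})}
```

The odds are a monotonic function of $P(L \mid D)$, and the factor $P(L)/P(\bar{L})$ is the same for every document under a given query, so ranking by $P(D \mid L)/P(D \mid \bar{L})$ gives the same order as ranking by $P(L \mid D)$.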
Document Ranking
Language Models
• What is the probability of generating query Q, given document d with language model M_d?
$$P(Q \mid M_d) = \prod_{t \in Q} \hat{P}_{ml}(t, d)$$
• $\hat{P}_{ml}(t, d)$: maximum likelihood estimate of the probability of term t in document d
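The query likelihood above can be computed directly. This sketch assumes the usual maximum likelihood estimate, term frequency divided by document length, and a whitespace tokenizer; neither detail is spelled out on the slide.

```python
from collections import Counter

def query_likelihood(query, doc):
    """P(Q | M_d) as the product over query terms of tf(t, d) / |d|."""
    doc_tokens = doc.lower().split()
    if not doc_tokens:
        return 0.0  # an empty document cannot generate any term
    counts = Counter(doc_tokens)
    prob = 1.0
    for term in query.lower().split():
        # Unsmoothed MLE: a term missing from the document contributes 0.
        prob *= counts[term] / len(doc_tokens)
    return prob

# "gain" occurs 2 times and "weight" 1 time in a 5-token document.
score = query_likelihood("gain weight", "gain weight and gain muscle")
missing = query_likelihood("gain fat", "gain weight and gain muscle")
```

Because the estimate is unsmoothed, a single missing query term drives the whole product to zero, which is why practical systems usually add some form of smoothing.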
Search Result Clustering and Diversification
What is the Goal
Searching on Google
All of these searches should show the same results, since they have the same meaning, and a user searching for one of them intends to find out about all of them.
Outline
• Search Process
• Query Processing
• Document Ranking
• Search Result Clustering and Diversification
• What is the Goal
• Contributions
  • Query Expansion
  • Query Expansion (Tasks to Decide)
  • Document Ranking
Contributions
How?
• New semantic query expansion method
• New semantic document ranking method
Query Expansion
Example: “Gain Weight”
• Desirable keywords in the expanded query: “Gain, weight, muscle, mass, fat”
[Figure: the query “Gain weight” shown with the terms “Muscle”, “Mass”, and “Fat”]
• What are these relations?
Query Expansion
Digging in DBpedia and Wikipedia
• http://en.wikipedia.org/wiki/Weight_gain
• http://dbpedia.org/page/Muscle
• http://dbpedia.org/page/Adipose_tissue
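One way to start digging programmatically is to ask DBpedia's public SPARQL endpoint for every property of a resource. The sketch below only builds the query and the request URL and sends nothing; the helper names are hypothetical, and the choice to list all property/value pairs is an assumption about what "digging" would involve.

```python
from urllib.parse import urlencode

DBPEDIA_SPARQL_ENDPOINT = "https://dbpedia.org/sparql"

def page_to_resource(page_url):
    """Turn a DBpedia /page/ URL (the HTML view) into its /resource/ URI."""
    return page_url.replace("/page/", "/resource/", 1)

def properties_query(resource_uri, limit=50):
    """SPARQL that lists property/value pairs of one resource."""
    return (
        f"SELECT ?property ?value "
        f"WHERE {{ <{resource_uri}> ?property ?value }} LIMIT {limit}"
    )

def request_url(resource_uri, limit=50):
    """Build (but do not send) a GET URL carrying the query."""
    params = {"query": properties_query(resource_uri, limit)}
    return DBPEDIA_SPARQL_ENDPOINT + "?" + urlencode(params)

muscle = page_to_resource("http://dbpedia.org/page/Muscle")
url = request_url(muscle)
```

The returned properties and their target entities are the raw material from which expansion candidates such as "muscle" or "fat" would have to be selected.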
Query Expansion (Tasks to Decide)
• How to map query phrases into Wikipedia components?
• Which properties and their related entities should be selected?
• Can those properties be selected automatically for each phrase, or should they be fixed for the whole algorithm?
• If the selection is automatic, what is the process?
Query Expansion (Tasks to Decide)
• Are DBpedia and Wikipedia enough to decide, or should we use other ontologies?
• How should we weight the extracted entities (terms, senses) in order to select the expanded query from among them?
Document Ranking
Are the documents annotated?
Yes
• Rank documents using the entities extracted in the query expansion phase.
No
• Rank the documents based on the semantics of the expanded query rather than its terms or phrases.
• Define probabilities over senses rather than terms in the query and documents.
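For the annotated case, ranking by extracted entities could look like the sketch below. The scoring rule (sum of the weights of query entities that a document is annotated with), the entity weights, and the document annotations are all hypothetical and serve only to make the idea concrete.

```python
def rank_by_entities(query_entities, doc_annotations):
    """Score each document by the summed weight of the query entities it carries."""
    scored = []
    for doc_id, entities in doc_annotations.items():
        score = sum(w for entity, w in query_entities.items() if entity in entities)
        scored.append((doc_id, score))
    # Highest score first; ties broken by document id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored

# Hypothetical weighted entities from the query expansion phase.
query_entities = {"Weight_gain": 1.0, "Muscle": 0.5, "Adipose_tissue": 0.5}

# Hypothetical entity annotations per document.
doc_annotations = {
    "d1": {"Weight_gain", "Adipose_tissue"},
    "d2": {"Muscle"},
    "d3": {"Semantic_search"},
}

ranking = rank_by_entities(query_entities, doc_annotations)
```

Matching on entity identifiers rather than surface terms is what lets two differently worded texts about the same thing score as similar.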
Documents are not annotated, so how?
Document Ranking
Semantic similarity between two non-annotated documents (the expanded query and the document)
• There are papers on using the WordNet ontology with a “topic-specific PageRank algorithm” to measure the similarity of two sentences (phrases or words).
• Their application to information retrieval has not been seen yet.
• Find the aspects of the different algorithms that are more beneficial in the information retrieval domain (two large documents).
• It is more reasonable to apply the algorithm on DBpedia (instead of WordNet), in the entity domain (instead of the sense domain).
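The topic-specific PageRank idea can be illustrated on a toy entity graph. This is a sketch under stated assumptions, not the method of any cited paper: topic-specific PageRank is taken to mean a random walk that restarts only at a text's own entities, the similarity of two texts is taken as the cosine of their resulting rank vectors, and the graph links are hypothetical rather than real DBpedia data.

```python
import math

def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration where the walk restarts only at the seed nodes."""
    nodes = list(graph)
    seeds = [n for n in nodes if n in seeds]
    if not seeds:
        raise ValueError("no seed node is present in the graph")
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = graph[n]
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    nxt[m] += share
            else:
                # A node without outgoing links returns its mass to the seeds.
                for s in seeds:
                    nxt[s] += damping * rank[n] * restart[s]
        rank = nxt
    return rank

def cosine(u, v):
    """Cosine similarity of two rank vectors defined over the same nodes."""
    dot = sum(u[n] * v[n] for n in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0

def text_similarity(graph, entities_a, entities_b):
    """Compare two texts through the rank vectors seeded by their entities."""
    return cosine(
        personalized_pagerank(graph, entities_a),
        personalized_pagerank(graph, entities_b),
    )

# Hypothetical entity links, not real DBpedia data.
GRAPH = {
    "Weight_gain": ["Muscle", "Adipose_tissue"],
    "Muscle": ["Weight_gain", "Exercise"],
    "Adipose_tissue": ["Weight_gain"],
    "Exercise": ["Muscle"],
    "Search_engine": ["Information_retrieval"],
    "Information_retrieval": ["Search_engine"],
}

sim_related = text_similarity(GRAPH, {"Weight_gain"}, {"Muscle", "Adipose_tissue"})
sim_unrelated = text_similarity(GRAPH, {"Weight_gain"}, {"Search_engine"})
```

The two texts that share a neighbourhood in the graph get a positive similarity even though they have no entity in common, while texts seeded in disconnected parts of the graph score zero.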
Document Ranking
• Applying search result clustering and diversification, based on the different semantics of the query.