Gravitate d3.1 report on shape analysis and matching and on semantic matching
Semantic Matching using Neural Networks
Transcript of Semantic Matching using Neural Networks
Semantic Matching using Neural Networks
Information Retrieval
CSE, IIT Kharagpur
March 20th, 2020
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 1 / 21
Semantic Matching
Definition“.. conduct query/document analysis to represent the meanings ofquery/document with richer representations and then perform matching withthe representations.”
i.e., go beyong keyword (lexical) matching.We will discuss both unsupervised and supervised methods of semanticmatching.
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 2 / 21
- -
idt LM dry
-
(pseudoT-
Relevance Beyond keyword maths?Feedback → Query Expansion-
- - -
Semantic Matching: What have we seen till now?
Query expansion
Relevance Feedback
Translation Model (How to model word similarity?)
Disrtributional HypothesisWords that occur in similar contexts tend to have similar meanings.
Word embeddings have proved to be very important for modeling semanticsimilarity
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 3 / 21
-=-3
Semantic Matching: What have we seen till now?
Query expansion
Relevance Feedback
Translation Model (How to model word similarity?)
Disrtributional HypothesisWords that occur in similar contexts tend to have similar meanings.
Word embeddings have proved to be very important for modeling semanticsimilarity
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 3 / 21
-
word similarity
DitMdels
Semantic Matching: What have we seen till now?
Query expansion
Relevance Feedback
Translation Model (How to model word similarity?)
Disrtributional HypothesisWords that occur in similar contexts tend to have similar meanings.
Word embeddings have proved to be very important for modeling semanticsimilarity
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 3 / 21
= --
Word2Vec – A distributed representation
Distributional representation – word embedding?Any word wi in the corpus is given a distributional representation by anembedding
wi 2 Rd
i.e., a d�dimensional vector, which is mostly learnt!
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 4 / 21
-
\
without anyannotations
Word2Vec – A distributed representation
Distributional representation – word embedding?Any word wi in the corpus is given a distributional representation by anembedding
wi 2 Rd
i.e., a d�dimensional vector, which is mostly learnt!
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 4 / 21
→ 8 -dim
Two Variations: CBOW and Skip-grams
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 5 / 21
- d
-d
- -
What do we finally have?
For each word wi in vocabulary (size V), we have two vectors: vINi and
vOUTi , each of d�dimensions.
Generally, you can just add these vectors and use vi = vINi + vOUT
i
Ideally, similar words will have similar vectors
How do we go about using these for the retrieval task
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 6 / 21
- -
-
--
- -
-
cosine simty
Query Expansion
Pre-trained word embeddings for query expansion
Basic IdeaIdentify expansion terms using word2Vec cosine similaity
Pre-retrieval: Taking nearest neighbors of query terms as the expansionterms
Post-retrieval: Using a set of pseudo-relevant documents to restrict thesearch domain for the candidate expansion terms.
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 7 / 21
--
I#inters.ee#q-tfuisdmt-TfEtops-
Neural Translation Language Model
Language Model: Using Query LikelihoodP(q|d) = ’tq2q p(tq|d)
What happens in translation language model
p(tq|d) = Âtd2d p(tq|td)p(td|d)
You can use similarity between term embeddings for term-term translationprobability, thus
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 8 / 21
(type mismatch
q . songs d : hits←
Neural Translation Language Model
Language Model: Using Query LikelihoodP(q|d) = ’tq2q p(tq|d)
What happens in translation language modelp(tq|d) = Âtd2d p(tq|td)p(td|d)
You can use similarity between term embeddings for term-term translationprobability, thus
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 8 / 21
✓
-#I⇐ Petey
Dual Embedding Space Model (DESM)
Nalisnick et al., 2016. Improving Document Ranking with Dual WordEmbeddings. (WWW ’16 Companion).
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 9 / 21
Ss
DualEmbeddedSogndal
-
win . vous#
gimenez IEEE -
Dual Embedding Space Model (DESM)
IN-IN and OUT-OUT cosine similarities are high for words that are similarby function or type (typical) and the
IN-OUT cosine similarities are high between words that often co-occur inthe same query or document (topical).
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 10 / 21
←BTP J ER system gin d.out
O_0dn
I I
=L universities /Names-
-
Pre-trained word embeddings for document retrieval
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 11 / 21
a. ① -
-
I an
OUT- OUT IN- for
How do you evaluate this?
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 12 / 21
→-_ 7.5K¥→I-
: Top 15 doc.-
- --- returned by
- I 12 system-
② ff_idfu-
-- - tf~
Vabsets
Results: Reranking k-best list
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 13 / 21
→-t
-
- ⇒
Results: whole ranking system
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 14 / 21
- - -
→
-
⇐IUnsupervised
Semantic Matching – with Supervision
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 15 / 21
learn how ←9 dd! Freeto represent &a
Tri -
-
-I-
-
I a -
=- =
=-
DSSM
Why supervised?We learn to represent queries and documents in the latent vector space byforcing the vector representations
for relevant query-document pairs (q,d+) to be close in the latent space;and
for irrelevant query-document pairs (q,d�) to be far in the latent vectorspace
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 16 / 21
-
--
=-
Understanding DSSM - How to represent text
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 17 / 21
5%
t - - gook dim
>I
31
Understanding DSSM - Architecture
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 18 / 21
>1dg -- 3
4 30 I 2
# ok
-
%e5.gg .
.
.
! me ...
*⇒ disease:we
DSSM - Training Objective
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 19 / 21
Evaluation Details
16,510 English queries sampled from one year query log files of Bing
Each query is associated with 15 web document titles
Relevance judgement on a scale of 0 to 4
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 20 / 21
-
-
DSSM - Results
Information Retrieval (IIT Kharagpur) Semantic Matching using Neural Networks March 20th, 2020 21 / 21
u