Multi-Relational Latent Semantic Analysis

Transcript of "Multi-Relational Latent Semantic Analysis".
![Page 1: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/1.jpg)
Multi-Relational Latent Semantic Analysis

Kai-Wei Chang. Joint work with Scott Wen-tau Yih and Chris Meek, Microsoft Research.
![Page 2: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/2.jpg)
Natural Language Understanding

Build an intelligent system that can interact with humans using natural language.

Research challenge: a meaning representation of text that supports useful inferential tasks.

Semantic word representation is the foundation: language is compositional, and the word is the basic semantic unit.
![Page 3: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/3.jpg)
Continuous Semantic Representations

A lot of popular methods for creating word vectors:

- Vector Space Model [Salton & McGill 83]
- Latent Semantic Analysis [Deerwester+ 90]
- Latent Dirichlet Allocation [Blei+ 01]
- Deep Neural Networks [Collobert & Weston 08]

These encode term co-occurrence information and measure semantic similarity well.
![Page 4: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/4.jpg)
Continuous Semantic Representations
(Figure: word clusters in the latent space, e.g., sunny, rainy, windy, cloudy; car, wheel, cab; sad, joy, emotion, feeling.)
![Page 5: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/5.jpg)
Semantics Needs More Than Similarity

"Tomorrow will be rainy." vs. "Tomorrow will be sunny."

rainy, sunny: similar? rainy, sunny: opposite? Similarity alone cannot tell these apart.
![Page 6: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/6.jpg)
Leverage Linguistic Resources

Can't we just use the existing linguistic resources?

- Knowledge in these resources is never complete
- They often lack the degree of relations

Goal: create a continuous semantic representation that

- leverages existing rich linguistic resources,
- discovers new relations, and
- enables us to measure the degree of multiple relations (not just similarity).
![Page 7: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/7.jpg)
Roadmap

- Introduction
- Background: Latent Semantic Analysis (LSA), Polarity Inducing LSA (PILSA)
- Multi-Relational Latent Semantic Analysis (MRLSA)
  - Encoding multi-relational data in a tensor
  - Tensor decomposition & measuring degree of a relation
- Experiments
- Conclusions
![Page 8: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/8.jpg)
![Page 9: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/9.jpg)
Latent Semantic Analysis [Deerwester+ 1990]

- Data representation: encode single-relational data in a matrix
  - Co-occurrence (e.g., from a general corpus)
  - Synonyms (e.g., from a thesaurus)
- Factorization: apply SVD to the matrix to find latent components
- Measuring degree of relation: cosine of latent vectors
![Page 10: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/10.jpg)
Encode Synonyms in Matrix

Input: synonyms from a thesaurus, e.g., Joyfulness: joy, gladden; Sad: sorrow, sadden.

Each target word (synonym group) is a row vector; each term is a column vector:

| | joy | gladden | sorrow | sadden | goodwill |
|---|---|---|---|---|---|
| Group 1: "joyfulness" | 1 | 1 | 0 | 0 | 0 |
| Group 2: "sad" | 0 | 0 | 1 | 1 | 0 |
| Group 3: "affection" | 0 | 0 | 0 | 0 | 1 |

Degree of relation: cosine score between column vectors.
![Page 11: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/11.jpg)
Mapping to Latent Space via SVD

𝐖 ≈ 𝐔 𝚺 𝐕ᵀ, with shapes (d×n) ≈ (d×k)(k×k)(k×n), where rows index target words and columns index terms.

- SVD generalizes the original data and uncovers relationships not explicit in the thesaurus
- Term vectors are projected to a k-dimensional latent space
- Word similarity: cosine of two column vectors in the latent space (𝚺𝐕ᵀ)
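As a rough illustration (not the authors' code), the LSA pipeline above can be sketched in a few lines of NumPy, using the toy thesaurus matrix from the previous slide. Because joy and gladden share a synonym group, their latent vectors coincide:

```python
import numpy as np

# Toy thesaurus matrix W (groups x terms), as on the slide:
# rows: joyfulness, sad, affection
# columns: joy, gladden, sorrow, sadden, goodwill
W = np.array([
    [1, 1, 0, 0, 0],   # joyfulness: joy, gladden
    [0, 0, 1, 1, 0],   # sad: sorrow, sadden
    [0, 0, 0, 0, 1],   # affection: goodwill
], dtype=float)

# SVD: W = U @ diag(s) @ Vt; keep only the top-k latent components
U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 2
term_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # row i: term i in k-dim latent space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# joy (col 0) and gladden (col 1) share a group -> cosine ~ 1
sim = cosine(term_vecs[0], term_vecs[1])
```

With k = 2, terms from different groups (e.g., joy and sorrow) come out orthogonal in this toy example, while same-group terms score 1.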
![Page 12: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/12.jpg)
Problem: Handling Two Opposite Relations

Synonyms & antonyms: LSA cannot distinguish antonyms [Landauer 2002].

"Distinguishing synonyms and antonyms is still perceived as a difficult open problem." [Poon & Domingos 09]
![Page 13: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/13.jpg)
Polarity Inducing LSA [Yih, Zweig, Platt 2012]

- Data representation: encode two opposite relations in a matrix using "polarity"
  - Synonyms & antonyms (e.g., from a thesaurus)
- Factorization: apply SVD to the matrix to find latent components
- Measuring degree of relation: cosine of latent vectors
![Page 14: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/14.jpg)
Encode Synonyms & Antonyms in Matrix

Inducing polarity: a group's antonyms receive negative entries. E.g., Joyfulness: joy, gladden (synonyms); sorrow, sadden (antonyms). Sad: sorrow, sadden (synonyms); joy, gladden (antonyms).

| | joy | gladden | sorrow | sadden | goodwill |
|---|---|---|---|---|---|
| Group 1: "joyfulness" | 1 | 1 | -1 | -1 | 0 |
| Group 2: "sad" | -1 | -1 | 1 | 1 | 0 |
| Group 3: "affection" | 0 | 0 | 0 | 0 | 1 |

Target word: row vector. Cosine score: synonyms come out positive, antonyms negative.
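A minimal sketch of the polarity trick on the toy matrix above: even before factorization, the cosine of two column vectors already separates synonyms (+1) from antonyms (-1):

```python
import numpy as np

# Polarity-inducing encoding (toy example from the slide): antonyms get -1
# rows: joyfulness, sad, affection
# columns: joy, gladden, sorrow, sadden, goodwill
W = np.array([
    [ 1,  1, -1, -1, 0],   # joyfulness: syn joy, gladden; ant sorrow, sadden
    [-1, -1,  1,  1, 0],   # sad: syn sorrow, sadden; ant joy, gladden
    [ 0,  0,  0,  0, 1],   # affection: goodwill
], dtype=float)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cols = {name: W[:, i] for i, name in enumerate(
    ["joy", "gladden", "sorrow", "sadden", "goodwill"])}

syn_score = cosine(cols["joy"], cols["gladden"])   # synonyms -> +1
ant_score = cosine(cols["joy"], cols["sadden"])    # antonyms -> -1
```

In PILSA the same property is preserved after SVD, so antonymy shows up as a strongly negative cosine in the latent space.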
![Page 15: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/15.jpg)
![Page 16: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/16.jpg)
Problem: How to Handle More Relations?

Limitation of the matrix representation: each entry captures a particular type of relation between two entities, or two opposite relations with the polarity trick.

Other binary relations also need encoding:

- Is-A (hyponym): ostrich is a bird
- Part-whole: engine is a part of car

Solution: encode multiple relations in a 3-way tensor (a 3-dimensional array)!
![Page 17: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/17.jpg)
Multi-Relational LSA

- Data representation: encode multiple relations in a tensor
  - Synonyms, antonyms, hyponyms (is-a), … (e.g., from a linguistic knowledge base)
- Factorization: apply tensor decomposition to the tensor to find latent components
- Measuring degree of relation: cosine of latent vectors after projection
![Page 18: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/18.jpg)
![Page 19: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/19.jpg)
![Page 20: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/20.jpg)
![Page 21: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/21.jpg)
![Page 22: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/22.jpg)
Encode Multiple Relations in Tensor

Represent word relations using a tensor: each slice encodes one relation between terms and target words. Construct a tensor with two slices (rows: target words joyfulness, gladden, sad, anger; columns: terms joy, gladden, sadden, feeling):

Synonym layer:

| | joy | gladden | sadden | feeling |
|---|---|---|---|---|
| joyfulness | 1 | 1 | 0 | 0 |
| gladden | 1 | 1 | 0 | 0 |
| sad | 0 | 0 | 1 | 0 |
| anger | 0 | 0 | 0 | 0 |

Antonym layer:

| | joy | gladden | sadden | feeling |
|---|---|---|---|---|
| joyfulness | 0 | 0 | 0 | 0 |
| gladden | 0 | 0 | 1 | 0 |
| sad | 1 | 0 | 0 | 0 |
| anger | 0 | 0 | 0 | 0 |
![Page 23: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/23.jpg)
Encode Multiple Relations in Tensor

Further relations become additional slices, e.g., a hyponym layer (joyfulness, sad, and anger are each a kind of feeling):

| | joy | gladden | sadden | feeling |
|---|---|---|---|---|
| joyfulness | 0 | 0 | 0 | 1 |
| gladden | 0 | 0 | 0 | 0 |
| sad | 0 | 0 | 0 | 1 |
| anger | 0 | 0 | 0 | 1 |
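The slice-per-relation construction above can be written down directly; this is a toy sketch (the word/term lists and index choices are illustrative, not the paper's actual vocabulary):

```python
import numpy as np

words = ["joyfulness", "gladden", "sad", "anger"]    # target words (rows)
terms = ["joy", "gladden", "sadden", "feeling"]      # terms (columns)
relations = ["syn", "ant", "hyper"]                  # one slice per relation

# W[i, j, k] = 1 iff relation k holds between target word i and term j
W = np.zeros((len(words), len(terms), len(relations)))

def add(word, term, rel):
    W[words.index(word), terms.index(term), relations.index(rel)] = 1

# Synonym slice (as in the slide)
for w, t in [("joyfulness", "joy"), ("joyfulness", "gladden"),
             ("gladden", "joy"), ("gladden", "gladden"), ("sad", "sadden")]:
    add(w, t, "syn")
# Antonym slice
for w, t in [("gladden", "sadden"), ("sad", "joy")]:
    add(w, t, "ant")
# Is-a slice: joyfulness, sad, anger are kinds of feeling
for w in ["joyfulness", "sad", "anger"]:
    add(w, "feeling", "hyper")
```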
![Page 24: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/24.jpg)
![Page 25: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/25.jpg)
Tensor Decomposition – Analogy to SVD

Derive a low-rank approximation to generalize the data and to discover unseen relations: apply Tucker decomposition and reformulate the results.

Analogously to SVD, the tensor (rows w₁, …, w_d; columns t₁, …, tₙ) is approximated by the product of a small r×r×r core and low-rank factor matrices; one factor holds the latent representation of words.
![Page 26: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/26.jpg)
Tensor Decomposition – Analogy to SVD

𝓦 ≈ 𝓢 ×₁ 𝐔 ×₂ 𝐕, i.e., each slice 𝓦₍:,:,k₎ ≈ 𝐔 𝐒ₖ 𝐕ᵀ

where the rows of 𝐕 are the latent representations of words (terms t₁, …, tₙ), and each r×r slice 𝐒ₖ is the latent representation of a relation (R₁, …, R_m).
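Tucker decomposition is available in libraries such as TensorLy; as a self-contained sketch, a truncated higher-order SVD (HOSVD, one common way to compute a Tucker approximation — not necessarily the exact algorithm used in the paper) looks like this:

```python
import numpy as np

def unfold(T, mode):
    """Matricize tensor T along the given mode (mode fibers as columns)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated HOSVD: returns core G and factor matrices (U0, U1, U2)
    so that T is approximately G x1 U0 x2 U1 x3 U2."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    G = T
    for mode, U in enumerate(factors):
        # Contract mode `mode` of G with U^T, keeping the axis order
        G = np.moveaxis(np.tensordot(G, U.T, axes=([mode], [1])), -1, mode)
    return G, factors

def reconstruct(G, factors):
    """Multiply the core back out: G x1 U0 x2 U1 x3 U2."""
    T = G
    for mode, U in enumerate(factors):
        T = np.moveaxis(np.tensordot(T, U, axes=([mode], [1])), -1, mode)
    return T
```

With full ranks the reconstruction is exact; truncating the ranks gives the low-rank approximation that generalizes beyond the observed entries.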
![Page 27: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/27.jpg)
![Page 28: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/28.jpg)
Measure Degree of Relation

- Similarity: cosine of the latent vectors
- Other relations (both symmetric and asymmetric):
  - take the latent matrix of the pivot relation (synonym),
  - take the latent matrix of the target relation,
  - cosine of the latent vectors after projection
![Page 29: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/29.jpg)
Measure Degree of Relation – Raw Representation

ant(joy, sadden) = cos(𝓦₍:,joy,syn₎, 𝓦₍:,sadden,ant₎)

i.e., the cosine between the joy column of the synonym layer and the sadden column of the antonym layer (layers as constructed above).
![Page 30: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/30.jpg)
![Page 31: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/31.jpg)
Estimate the Degree of a Relation – Raw Representation

Hyper(joy, feeling) = cos(𝓦₍:,joy,syn₎, 𝓦₍:,feeling,hyper₎)

i.e., the cosine between the joy column of the synonym layer and the feeling column of the is-a layer.
![Page 32: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/32.jpg)
Measure Degree of Relation – Raw Representation

In general, rel(wᵢ, wⱼ) = cos(𝓦₍:,wᵢ,syn₎, 𝓦₍:,wⱼ,rel₎): take wᵢ's column in the synonym layer and wⱼ's column in the slice of the specific relation.
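The raw-representation score is a direct transcription of this formula; here it is checked against the toy synonym/antonym tensor from the earlier slides:

```python
import numpy as np

# Toy tensor from the earlier slides: rows are target words
# (joyfulness, gladden, sad, anger), columns are terms
# (joy, gladden, sadden, feeling)
syn = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 0],
                [0, 0, 0, 0]], dtype=float)
ant = np.array([[0, 0, 0, 0],
                [0, 0, 1, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 0]], dtype=float)
W = np.stack([syn, ant], axis=2)   # shape: (4 words, 4 terms, 2 relations)
SYN, ANT = 0, 1

def rel_score_raw(W, i, j, rel, pivot=SYN):
    """cos(W[:, i, pivot], W[:, j, rel]): term i's column in the pivot
    (synonym) slice vs. term j's column in the target relation's slice."""
    a, b = W[:, i, pivot], W[:, j, rel]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# ant(joy, sadden): joy is column 0, sadden is column 2
score = rel_score_raw(W, 0, 2, ANT)
```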
![Page 33: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/33.jpg)
Measure Degree of Relation – Latent Representation

After decomposition, the same score is computed from the latent factors:

rel(wᵢ, wⱼ) = cos(𝐒₍:,:,syn₎ 𝐕ᵢ,:ᵀ, 𝐒₍:,:,rel₎ 𝐕ⱼ,:ᵀ)

where 𝐕ᵢ,: is the latent vector vᵢ of word wᵢ, and 𝐒₍:,:,syn₎, 𝐒₍:,:,rel₎ are the latent representations of the pivot (synonym) relation R_syn and the target relation R_rel.
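Given Tucker factors, the latent score transcribes directly into code. The factors below are random placeholders standing in for a trained model, purely to exercise the function:

```python
import numpy as np

def rel_score_latent(S, V, i, j, rel, pivot=0):
    """MRLSA latent score: cos(S[:,:,pivot] @ V[i], S[:,:,rel] @ V[j]),
    where V's rows are latent word vectors and S[:,:,k] is the latent
    representation of relation k (the pivot is the synonym relation)."""
    a = S[:, :, pivot] @ V[i]
    b = S[:, :, rel] @ V[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder factors (a trained model would supply these): rank r = 3,
# 5 words, 2 relations (0 = synonym pivot, 1 = some other relation)
rng = np.random.default_rng(1)
S = rng.standard_normal((3, 3, 2))
V = rng.standard_normal((5, 3))
score = rel_score_latent(S, V, 0, 1, rel=1)
```

Scoring a word against itself under the pivot relation gives 1 by construction, and every score is a cosine, so it lies in [-1, 1].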
![Page 34: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/34.jpg)
Roadmap

- Introduction
- Background: Latent Semantic Analysis (LSA), Polarity Inducing LSA (PILSA)
- Multi-Relational Latent Semantic Analysis (MRLSA)
  - Encoding multi-relational data in a tensor
  - Tensor decomposition & measuring degree of a relation
- Experiments
- Conclusions
![Page 35: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/35.jpg)
Experiment: Data for Building MRLSA Model

- Encarta Thesaurus: records synonyms and antonyms of target words; vocabulary of 50k terms and 47k target words
- WordNet: has synonym, antonym, hyponym, hypernym relations; vocabulary of 149k terms and 117k target words

Goals:

- Show that MRLSA generalizes LSA to model multiple relations
- Improve performance by combining heterogeneous data
![Page 36: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/36.jpg)
Example Antonyms Output by MRLSA

| Target | High Score Words |
|---|---|
| inanimate | alive, living, bodily, in-the-flesh, incarnate |
| alleviate | exacerbate, make-worse, in-flame, amplify, stir-up |
| relish | detest, abhor, abominate, despise, loathe |

* Words in blue are antonyms listed in the Encarta thesaurus.
![Page 37: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/37.jpg)
Results – GRE Antonym Test

Task: GRE closest-opposite questions, e.g., which is the closest opposite of adulterate? (a) renounce (b) forbid (c) purify (d) criticize (e) correct

| Method | Accuracy |
|---|---|
| Mohammad et al. 08 | 0.64 |
| Lookup | 0.56 |
| PILSA | 0.74 |
| MRLSA | 0.77 |
![Page 38: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/38.jpg)
Example Hyponyms Output by MRLSA

| Target | High Score Words |
|---|---|
| bird | ostrich, gamecock, nighthawk, amazon, parrot |
| automobile | minivan, wagon, taxi, minicab, gypsy cab |
| vegetable | buttercrunch, yellow turnip, romaine, chipotle, chilli |
![Page 39: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/39.jpg)
Results – Relational Similarity (SemEval-2012)

Task: Class-Inclusion relation (X is a kind of Y). Pick the most/least illustrative word pairs, e.g., (a) art:abstract (b) song:opera (c) footwear:boot (d) hair:brown

| Method | Accuracy |
|---|---|
| UTD | 0.34 |
| Lookup | 0.37 |
| MRLSA | 0.56 |
![Page 40: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/40.jpg)
Problem: Use Relational Domain Knowledge

Relational domain knowledge: the entity type. A relation can only hold between the right types of entities:

- Words having an is-a relation have the same part-of-speech
- For the relation born-in, the entity types are (person, location)

Leverage type information to improve MRLSA. Idea #3: change the objective function.
![Page 41: Multi-Relational Latent Semantic Analysis .](https://reader036.fdocuments.us/reader036/viewer/2022062222/568161e1550346895dd1f6cd/html5/thumbnails/41.jpg)
Conclusions

A continuous semantic representation that:

- leverages existing rich linguistic resources,
- discovers new relations, and
- enables us to measure the degree of multiple relations

Approaches: better data representation; matrix/tensor decomposition.

Challenges & future work: capture more types of knowledge in the model; support more sophisticated inferential tasks.