Final Projects
A Large-Scale Multilingual Disambiguation of Glosses
José Camacho Collados, Claudio Delli Bovi, Alessandro Raganato, and Roberto Navigli
lcl.uniroma1.it/disambiguated-glosses
Definitional Knowledge in NLP
○ Word Sense Disambiguation
○ Taxonomy/Ontology Learning
○ Information Extraction
○ Plagiarism Detection
○ Question Answering
○ ...
Lesk, 1986
Banerjee and Pedersen, 2002
Navigli and Velardi, 2005
Agirre and Soroa, 2009
Fernandez-Ordonez et al., 2012
Chen et al., 2014
Camacho-Collados et al., 2015
Velardi et al., 2013
Flati et al., 2014
Espinosa-Anke et al., 2016
Richardson et al., 1998
Delli Bovi et al., 2015
Franco-Salvador et al., 2016
Hill et al., 2015
Definitions and glosses are everywhere!
○ WordNet/Open Multilingual WordNet: 150k definitions in 5 languages
○ Wiktionary: 285k definitions in 1 language
○ Wikidata: 8M definitions in 255 languages
○ OmegaWiki: 118k definitions in 89 languages
○ Wikipedia: >30M definitions in 264 languages
Disambiguating glosses on a large scale
Our goal:
✓ Construct a large-scale, multilingual repository of glosses and definitions with sense annotations
BabelNet (babelnet.org):
○ The largest multilingual encyclopedic dictionary and semantic network
○ Merger of 13 different knowledge resources
○ >35M definitions in >250 languages!
How?
Problem:
○ Disambiguating definitions is hard!
“Interchanging the positions of the king and a rook.” (definition of “castling” in chess, WordNet)
Multilingual WSD/EL based on BabelNet (Moro et al., 2014)
Short and concise, not enough context
Intuition:
○ Use various definitions of the same concept or entity at the same time and in multiple languages
Step 1: Context-rich Disambiguation
○ Multilingual preprocessing pipeline:
● Tokenization: from the Polyglot project (165 languages)
● Part-of-speech tagging: Stanford parser trained on Universal Dependencies (30 languages)
Today at LREC, Session O19!
○ Context enrichment:
● Given a definiendum, collect all its definitions in every available language and resource and bring them together into a single, heterogeneous multilingual text!
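The context-enrichment step can be pictured with a small sketch. It assumes definitions arrive as hypothetical (synset, language, resource, text) records; the `Gloss` type, the `enrich_context` helper, and the "(unspecified)" resource labels are illustrative names and placeholders, not part of the released pipeline.

```python
from collections import defaultdict
from typing import NamedTuple


class Gloss(NamedTuple):
    # Hypothetical record layout, for illustration only.
    synset_id: str   # identifier of the definiendum (concept or entity)
    language: str    # language code of the definition
    resource: str    # source resource of the definition
    text: str        # the definition itself


def enrich_context(glosses):
    """Group all definitions of the same definiendum into one multilingual text."""
    by_synset = defaultdict(list)
    for gloss in glosses:
        by_synset[gloss.synset_id].append(gloss.text)
    # One heterogeneous text per definiendum: its definitions, one per line.
    return {synset: "\n".join(texts) for synset, texts in by_synset.items()}


glosses = [
    Gloss("castling", "EN", "WordNet", "Interchanging the positions of the king and a rook."),
    Gloss("castling", "FR", "(unspecified)", "Manœuvre du jeu d'échecs"),
    Gloss("castling", "NO", "(unspecified)", "Rokade er et spesialtrekk i sjakk."),
]
context = enrich_context(glosses)
print(context["castling"])
```

The resulting text for each definiendum would then be handed to the disambiguation system as a single input, so that every definition serves as context for the others.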
○ Babelfy (Moro et al., 2014):
● Unified graph-based approach to multilingual Word Sense Disambiguation and Entity Linking
● Designed to handle multilingual text (“language-agnostic” setting)
babelfy.org
Our running example: castling
[EN] Interchanging the positions of the king and a rook.
[EN] Castling is a move in the game of chess involving a player’s king and either of the player's original rooks.
[EN] A move in which the king moves two squares towards a rook, and the rook moves to the other side of the king.
[FR] Manœuvre du jeu d'échecs
[DE] Spielzug im Schach, bei dem König und Turm einer Farbe bewegt werden
[ES] El enroque es un movimiento especial en el juego de ajedrez que involucra al rey y a una de las torres del jugador.
[CS] Rošáda je zvláštní tah v šachu, při kterém táhne zároveň král a věž.
[TR] Rok İngilizce'de kaleye rook denmektedir.
[NO] Rokade er et spesialtrekk i sjakk.
[EL] Το ροκέ είναι μια ειδική κίνηση στο σκάκι που συμμετέχουν ο βασιλιάς και ένας από τους δυο πύργους.
Step 2: Disambiguation Refinement
NASARI_embed: Latent semantic representations of BabelNet synsets and Wikipedia pages as 300-dimensional vectors (Camacho-Collados, Pilehvar and Navigli, ACL 2015).
lcl.uniroma1.it/nasari/
- Goal: Re-disambiguate low-confidence annotations from the first step.
- How: We obtain the centroid NASARI vector of the high-confidence annotations and compute its cosine similarity with the NASARI vectors of all the candidate synsets.
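The centroid-and-cosine idea behind the refinement step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy 3-dimensional vectors stand in for 300-dimensional NASARI vectors, the candidate names are invented, and the `threshold` parameter with its fall-back to `None` is a hypothetical design choice.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def refine(high_confidence_vectors, candidates, threshold):
    """Pick the candidate closest to the centroid, or None if below threshold."""
    c = centroid(high_confidence_vectors)
    best = max(candidates, key=lambda name: cosine(c, candidates[name]))
    score = cosine(c, candidates[best])
    return (best, score) if score >= threshold else (None, score)


# Toy vectors (assumed values) for two high-confidence annotations.
high_confidence = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
# Candidate synsets for one low-confidence mention (hypothetical names).
candidates = {
    "king (chess piece)": [0.85, 0.15, 0.05],
    "king (monarch)": [0.1, 0.9, 0.2],
}
choice, score = refine(high_confidence, candidates, threshold=0.5)
print(choice, round(score, 2))
```

A candidate that points in nearly the same direction as the centroid of the trusted annotations is selected, while a mention whose best candidate stays below the threshold is left without an annotation.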
[Figure: running example annotated with the vectors v_rook-chess and v_chess of the high-confidence annotations and the centroid c_castling]
[Figure: similarity values shown on the running example: θ = 0.86 and θ < 0.5]
Evaluation
● Extrinsic Evaluation:
○ Open Information Extraction (DefIE)
○ Sense Clustering (NASARI)
● Manual Intrinsic Evaluation:
○ 3 languages (EN, IT, ES)
○ Sample of 100 definitions each
Extrinsic Evaluation I: Open Information Extraction
DefIE
Large-Scale Information Extraction from Textual Definitions through Deep Syntactic and Semantic Analysis (Delli Bovi et al., TACL 2015)
lcl.uniroma1.it/defie/
- DefIE uses disambiguated definitions as input. We simply plugged in our disambiguated definitions as input and left its whole pipeline unchanged.
- This leads to improvements according to both manual and automatic evaluation.
Evaluation on a sample of 150 definitions
[Charts: number of extractions; precision]
Extrinsic Evaluation II: Construction of NASARI+
NASARI semantic representation construction pipeline (ACL 2015)
We simply enrich the BabelNet taxonomy with the high-precision disambiguated glosses. The whole pipeline remains unchanged.
[Figure: NASARI construction pipeline with the added glosses (“+ Glosses”)]
Extrinsic Evaluation II: Wikipedia Sense Clustering
[Chart: accuracy]
Intrinsic Evaluation
Manual evaluation on a sample of 300 definitions
[Chart: precision of the three different disambiguation strategies]
Coverage of the high-precision version of the corpus: ~65% for all PoS and ~75% for nouns
Overview of the release
● Two different versions of the corpus:
○ Complete version before refinement (Step 1)
○ High-Precision version after refinement (Step 2)
● Formatted in an easy-to-process XML, divided by language and resource
Statistics - #Sense annotations
Before refinement (Step 1)
249,544,708 annotations
After refinement (Step 2)
163,029,131 annotations
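As a quick check on the two totals, the share of annotations that survives refinement can be computed directly. It comes out at roughly 65%, which appears consistent with the coverage reported for the high-precision corpus over all PoS. The variable names below are illustrative.

```python
before_refinement = 249_544_708  # annotations after Step 1
after_refinement = 163_029_131   # annotations after Step 2

# Share of Step 1 annotations kept by Step 2, and the absolute number dropped.
retained = after_refinement / before_refinement
discarded = before_refinement - after_refinement
print(f"retained: {retained:.1%}, discarded: {discarded:,}")
```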
Statistics - #Sense annotations per language
[Bar chart; values shown: ~58.8M, ~37.9M, ~8.3M, ~10.6M, ~8.4M, ~3.4M, ~14.1M, ~18.2M, ~13.4M, ~5.2M]
Conclusion
A large-scale multilingual corpus of disambiguated glosses:
● 250 million sense annotations for both concepts and named entities
● In total, over 35 million definitions have been disambiguated
● 256 languages
● Both versions of the corpus freely available online
PLAY WITH ME!
http://lcl.uniroma1.it/disambiguated-glosses/
Thank you!
http://lcl.uniroma1.it/disambiguated-glosses/