Presentacion tesisAngel-revisada
![Page 1: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/1.jpg)
LiDom Builder: Automatising the Construction of Multilingual Domain Modules
Ángel Conde Manjón
GaLan Research Group – LSI Department
University of the Basque Country (UPV/EHU)
Supervisors: Dr. Mikel Larrañaga Olagaray & Dr. Ana Arruarte Lasa
UPV/EHU
25 February 2016
![Page 2: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/2.jpg)
• Technology Supported Learning Systems (TSLS)
  • Learning Management Systems
  • Massive Open Online Courses
  • Intelligent Tutoring Systems: SQL-Tutor
  • …
• Bilingual and multilingual contexts are a reality (Unesco, 2003)
• Acquiring the Domain Module is a costly and labour-intensive task
Sections: Introduction | LiDom Builder | Acquisition of Multilingual Terminology | Identification of Pedagogical Relationships | Gathering Learning Objects | Conclusions and Future Work
Context
![Page 3: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/3.jpg)
Main Goal
Automatising the construction of MULTILINGUAL DOMAIN MODULES
![Page 4: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/4.jpg)
DOM-Sortze (Larrañaga, 2012): a framework for building DOMAIN MODULES from electronic textbooks
Previous Work: DOM-Sortze
![Page 5: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/5.jpg)
[Figure: DOM-Sortze pipeline from Electronic Textbook to Domain Module, with three numbered steps: Preprocess, LDO Gathering, LOs Gathering. Intermediate artefacts: Document Body Internal Representation, Document Outline Internal Representation, Learning Domain Ontology.]
Previous Work: DOM-Sortze
![Page 6: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/6.jpg)
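[Figure: example Domain Module. Topics: Planetary System, Solar System, Moon, Satellite, Planet, Earth. Relationships between topics: partOf, isA, prerequisite. The learning object LO1 ("The Moon is Earth's only natural satellite") is attached through hasDR.]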
DOM-Sortze: Domain Module Representation Formalism
Learning Domain Ontology (LDO): topics and pedagogical relationships
Learning Objects (LO):
• Definitions
• Examples
• Problem Statements
• …
![Page 7: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/7.jpg)
Limitations of DOM-Sortze:
1. Developed for a single language: Basque.
2. Its formalism is not able to represent Multilingual Domain Modules.
DOM-Sortze: Limitations
![Page 8: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/8.jpg)
1. Can the formalism used in DOM-Sortze be enhanced for Multilingual Domain Modules?
– Extend the formalism to deal with Multilingual Domain Modules.
2. Which enhancements are required to deal with various languages?
– Develop a method for extracting Multilingual Terminology.
– Improve the Relationship Acquisition.
– Provide a method for acquiring Multilingual Learning Objects.
Automatising the construction of MULTILINGUAL DOMAIN MODULES
Goals
![Page 9: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/9.jpg)
I. Introduction: Motivations and Goals
II. LiDom Builder: Building Multilingual Domain Modules
III. Acquisition of Multilingual Terminology
IV. Identification of Pedagogical Relationships
V. Gathering Multilingual Learning Objects
VI. Conclusions and Future Work
Outline
![Page 10: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/10.jpg)
![Page 11: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/11.jpg)
Overview
LiDom Builder: a framework for automatising the acquisition of Multilingual Domain Modules
[Figure: Textbook → LiDom Builder (Multilingual Terminology Extraction, Pedagogical Relationship Extraction, Multilingual Learning Object Generation) → Domain Module]
![Page 12: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/12.jpg)
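[Figure: multilingual Domain Module. Topics (Planetary System, Solar System, Moon, Satellite, Planet, Earth) are linked by partOf, isA, prerequisite and pedagogicallyClose relationships. The topic Moon carries language-tagged labels: "ilargi"@eu, "luna"@es, "moon"@en. Learning objects LO1 and LO2 are attached through hasDR and marked as equivalents in "en" and "es".]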
Multilingual Domain Module Formalism
![Page 13: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/13.jpg)
Proposed Enhancements
[Figure: enhanced pipeline from Electronic Textbook to Domain Module, with steps numbered 0 to 3: Language Identification, Preprocess, LDO Gathering, LOs Gathering. Intermediate artefacts: Document Internal Representation, Document Outline Internal Representation, Learning Domain Ontology.]
• Preprocess: NLP parsers (Illinois Chunker, Illinois POS tagger, FreeLing, IXA-Pipes).
• LDO Gathering: Topic Extraction (LiTeWi) and Relationship Extraction (LiReWi); set of heuristics, grammar.
• LOs Gathering: Multilingual LOs (LiLoWi); grammar, discourse markers.
![Page 14: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/14.jpg)
[Figure: the same pipeline (Electronic Textbook, Preprocess, LDO Gathering, LOs Gathering, Domain Module), extended with external Knowledge Resources.]
Proposed Enhancements
![Page 15: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/15.jpg)
• Two phases
  • Tuning up
    • Set the thresholds and default confidence values.
  • Evaluation
    • Gold Standard (Recall, Precision, F1-Score).
    • Expert validation.
• Use of three textbooks
  1. Programming: Introduction to Object Oriented Programming (Wong, S., 2010).
  2. Astronomy: Introduction to Astronomy (Morison, 2008).
  3. Biology: Introduction to Molecular Biology (Raineri, 2010).
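The gold-standard measures named above can be sketched as follows. This is a minimal illustration, assuming that extracted and gold items are compared by exact match; the function name and the sample terms are hypothetical and are not taken from the evaluation.

```python
def precision_recall_f1(extracted, gold):
    """Return (precision, recall, f1) for two collections of items."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data, for illustration only.
extracted_terms = {"planet", "moon", "orbit", "windows 98"}
gold_terms = {"planet", "moon", "orbit", "galaxy", "star", "comet"}
p, r, f1 = precision_recall_f1(extracted_terms, gold_terms)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

Expert validation is a separate, manual judgement and is not modelled here.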
General Evaluation Methodology
![Page 16: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/16.jpg)
![Page 17: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/17.jpg)
In DOM-Sortze, terminology was extracted with ErauzTerm (Alegria et al., 2004).
A new tool called LiTeWi has been developed.
Acquisition of Multilingual Terminology
![Page 18: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/18.jpg)
LiTeWi
[Figure: LiTeWi pipeline. Inputs: Electronic Textbook, Generic Corpus. Components shown: Candidate Extraction (TF-IDF, KP-Miner, CValue, Shallow Parsing Grammar), Combination, Candidate Selection, Mapping, Disambiguation, Filtering, Mapping to other languages.]
![Page 19: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/19.jpg)
Shallow Parsing Algorithm
• Uses a grammar derived from (Larrañaga, 2012).
[Figure: Shallow Parser]
Textbook → sentences that may contain topics: "This is called an Array List", "A Stack is used to model systems that exhibit LIFO…"
Extraction Rules: Constraint Grammar applied to POS tags, e.g. Topic + [*] + part of + [det] + Topic …
Chunks: an Array List, A Stack, …
Topics: Array List, Stack, …
![Page 20: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/20.jpg)
![Page 21: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/21.jpg)
Mapping
• Terms mapped to their corresponding Wikipedia articles.
• Search procedure to match Wikipedia article titles and their labels.
![Page 22: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/22.jpg)
![Page 23: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/23.jpg)
Disambiguation
• Method based on global disambiguation (Milne et al., 2008).
• Domain knowledge step added to improve the results.
• The important domain terms are used as the disambiguation context.
• Gold Term List: important domain terms with only one sense, i.e. the monosemic terms with the highest CValue scores.
![Page 24: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/24.jpg)
Disambiguation
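[Figure: disambiguating "Java" with the Wikiminer Compare Service]
Term List (to disambiguate): Java, Inheritance, Property
Gold Term List: Class, Programming Language, Array List
Relatedness between each candidate sense of Java and the gold terms:

| Sense of Java | Class | Prog. Lang. | Array List | Average |
|---|---|---|---|---|
| Prog. Language | 0.90 | 0.85 | 0.64 | 0.89 |
| Island | 0.7 | 0.77 | 0.53 | 0.70 |
| City | 0.56 | 0.75 | 0.6 | 0.63 |

Disambiguated term: Java (programming language)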
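The selection step in this example can be sketched as picking the sense with the highest average relatedness to the gold terms. This is a minimal sketch, assuming the relatedness scores have already been obtained (in LiTeWi they come from the Wikiminer Compare Service, which is not called here). The per-term scores are copied from the slide; the sense labels and the function name are illustrative.

```python
from statistics import mean

# Relatedness of each candidate sense of "Java" to the gold terms.
# The sense labels are illustrative names for the three rows on the slide.
relatedness = {
    "Java (programming language)": {
        "Class": 0.90, "Programming Language": 0.85, "Array List": 0.64,
    },
    "Java (island)": {
        "Class": 0.70, "Programming Language": 0.77, "Array List": 0.53,
    },
    "Java (city)": {
        "Class": 0.56, "Programming Language": 0.75, "Array List": 0.60,
    },
}


def disambiguate(candidate_senses):
    """Pick the sense whose average relatedness to the gold terms is highest."""
    return max(
        candidate_senses,
        key=lambda sense: mean(candidate_senses[sense].values()),
    )


print(disambiguate(relatedness))
```

Using the gold terms as the comparison context is what ties the chosen sense to the textbook's domain rather than to the most frequent sense overall.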
![Page 25: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/25.jpg)
![Page 26: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/26.jpg)
Filtering Unwanted Terms
[Figure: filtering with the Wikiminer Compare Service]
Term List (to filter): Universal Studios, Planet, Windows 98
Gold Term List: Solar System, Black Hole, Solar Mass
For each term, the relatedness score to every gold term is computed and the number of related gold terms (score >= 0.6) is counted. A term is kept as a Domain Related Term when that number N is greater than 1.
Example scores shown: Solar System (0.34), Black Hole (0.53), Solar Mass (0.47); Solar System (0.23), Black Hole (0.68), Solar Mass (0.50).
Result: Planet is kept; Universal Studios and Windows 98 are discarded.
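The filtering rule can be sketched as a count of gold terms whose relatedness reaches the threshold. This is a minimal sketch, assuming the scores are already available; the function names are hypothetical, the scores for Planet are invented for illustration, and the two other rows reuse the score triples printed on the slide under placeholder names, since the slide text does not say which term each triple belongs to.

```python
def count_related_gold_terms(scores, threshold=0.6):
    """Number of gold terms whose relatedness reaches the threshold."""
    return sum(1 for score in scores.values() if score >= threshold)


def is_domain_related(scores, threshold=0.6, min_related=2):
    """Keep a term only if it is related to more than one gold term (N > 1)."""
    return count_related_gold_terms(scores, threshold) >= min_related


candidates = {
    # Hypothetical scores for a term that should be kept.
    "Planet": {"Solar System": 0.82, "Black Hole": 0.65, "Solar Mass": 0.61},
    # Score triples shown on the slide, under placeholder names.
    "candidate_a": {"Solar System": 0.34, "Black Hole": 0.53, "Solar Mass": 0.47},
    "candidate_b": {"Solar System": 0.23, "Black Hole": 0.68, "Solar Mass": 0.50},
}

kept = [term for term, scores in candidates.items() if is_domain_related(scores)]
print(kept)
```

Requiring more than one related gold term keeps a single coincidental match (as in `candidate_b`) from letting an off-domain term through.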
![Page 27: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/27.jpg)
Mapping to other languages:

| Topic | EN | ES | EU |
|---|---|---|---|
| Moon | Moon | Luna | Ilargia |
![Page 28: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/28.jpg)
Evaluation
Tuning up
• Introduction to Object Oriented Programming textbook.
Evaluation
• Gold Standard and Expert Validation.
• Gold Standard based on the terms appearing in the index of each textbook.
• Evaluated on Introduction to Astronomy and Introduction to Molecular Biology.
![Page 29: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/29.jpg)
Results
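• Wikifier (Cheng, 2013)

| Textbook | Gold Standard: Precision (%) | Gold Standard: Recall (%) | Gold Standard: F1 Score (%) | Expert Validation: Correctness (%) |
|---|---|---|---|---|
| Astronomy | 3.55 | 62.96 | 6.72 | 18.55 |
| Mol. Biology | 2.24 | 10.21 | 3.67 | 49.27 |

• LiTeWi

| Textbook | Gold Standard: Precision (%) | Gold Standard: Recall (%) | Gold Standard: F1 Score (%) | Expert Validation: Correctness (%) |
|---|---|---|---|---|
| Astronomy | 17.96 | 72.55 | 28.79 | 78.77 |
| Mol. Biology | 27.09 | 50.53 | 87.70 | 71.65 |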
![Page 30: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/30.jpg)
![Page 31: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/31.jpg)
Introduction
In DOM-Sortze, relationships were acquired for Basque using shallow parsing.
The heuristic-based analysis of the outline has been adapted and extended.
A new tool called LiReWi has been developed.
![Page 32: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/32.jpg)
Heuristic-based analysis of the outline
Document Outlines
• Reflect the organization made by the author.
• The structure of the outline reflects underlying pedagogical relationships.
• Low-cost process (summarised).

DOM-Sortze
• Each outline item is considered a domain topic.
• By default, gathers a partOf relation between an item and its subitems.
• Heuristics to detect isA relations.

LiDom Builder
• Adaptation to English of the heuristics from (Larrañaga et al., 2004).
• Improvement of isA identification using WikiTaxonomy (Ponzetto et al., 2007).
![Page 33: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/33.jpg)
Wikipedia Enhanced Process
Example outline:
…
4.- Structure of polymers / Macromolecules
  4.1.- Polymer chemistry
  4.2.- Molecular weight
  4.3.- Form, structure and molecular configuration
  4.3.- Supramolecular arrangement
  4.4.- Crystalline and amorphous polymers
  4.5.- Families of polymeric materials
    4.5.1.- Thermosettings
    4.5.2.- Thermoplastics
    4.5.3.- Elastomers
5.- Phase diagrams / Definitions
  5.1.- Solid solutions
  5.2.- Phases rule of Gibbs
  5.3.- Types of phase diagram

1. Identify groups of sibling nodes.
2. Select the groups of leaf nodes in which the partOf relationship has been identified.
3. Link and disambiguate each node to a Wikipedia article using Wikiminer (Milne et al., 2012):
   Thermosettings polymer (Article id = 321827)
   Thermoplastic (Article id = 182444)
   Elastomer (Article id = 842224)
4. Process every group using the (Ponzetto et al., 2007) taxonomy. Categories shown in the example: Materials science, Elastomers, Polymer physics; Polymer physics, Polymer chemistry.
5. Infer an isA relationship in those groups that share a common ancestor.
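Steps 4 and 5 can be sketched as a common-ancestor test over a category taxonomy. This is a minimal sketch under stated assumptions: the taxonomy is a hypothetical toy dictionary (not actual WikiTaxonomy content), the function names are invented, and the sketch assumes that a sibling group sharing an ancestor has its default partOf relation to the parent item relabelled as isA.

```python
def ancestors(category, parents):
    """All categories reachable by following parent links upward."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def shares_common_ancestor(siblings, parents):
    """True if every sibling reaches at least one common category."""
    if len(siblings) < 2:
        return False
    ancestor_sets = [ancestors(s, parents) | {s} for s in siblings]
    return bool(set.intersection(*ancestor_sets))


def relabel_group(parent_topic, siblings, parents):
    """Relabel a sibling group as isA when it shares an ancestor, else keep partOf."""
    relation = "isA" if shares_common_ancestor(siblings, parents) else "partOf"
    return [(sibling, relation, parent_topic) for sibling in siblings]


# Hypothetical toy taxonomy: child category -> parent categories.
toy_taxonomy = {
    "Thermosetting polymer": ["Polymers"],
    "Thermoplastic": ["Polymers"],
    "Elastomer": ["Polymers"],
    "Polymers": ["Materials"],
    "Solid solutions": ["Alloys"],
    "Phases rule of Gibbs": ["Thermodynamics"],
}

print(relabel_group(
    "Families of polymeric materials",
    ["Thermosetting polymer", "Thermoplastic", "Elastomer"],
    toy_taxonomy,
))
```

A group without a shared ancestor, such as the two toy entries under "Phase diagrams", keeps its default partOf relation.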
![Page 34: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/34.jpg)
Evaluation
Gold Standard
• 57 document outlines in English from different domains.
• Human instructors defined the optimal output (LDOs).
• Each LDO restricted to the topics of the outline.
![Page 35: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/35.jpg)
Results
• Heuristic Analysis

| | partOf | isA | Total |
|---|---|---|---|
| Precision (%) | 84.12 | 78.95 | 83.85 |
| Recall (%) | 98.66 | 21.20 | 83.85 |

• Heuristic Analysis + Wikipedia Enhanced Process

| | partOf | isA | Total |
|---|---|---|---|
| Precision (%) | 89.19 | 77.30 | 87.70 |
| Recall (%) | 96.49 | 50.53 | 87.70 |
![Page 36: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/36.jpg)
Identification of Pedagogical Relationships: LiReWi
[Figure: LiReWi pipeline. Inputs: Electronic Textbook, Topics, Knowledge Bases. Stages: Mapping, Candidate Relationship Extraction, Combination & Filtering.]
![Page 37: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/37.jpg)
Mapping
[Figure: mapping a topic to WordNet]
Topic: Syntax, Wikipedia id = 3206060, WordNet id = ?
Two Wikipedia-to-WordNet mapping resources are consulted: Fernando's Mappings and BabelNet Mappings. For Wiki id 3206060 they return different WordNet ids (8436203 and 6176322).
Comparer: when the returned ids differ (!=), PageRank disambiguation chooses among the candidate WordNet ids (8436203, 6176322, …) using a disambiguation context (Java, Programming, …).
Final id: WordNet id = 6176322
![Page 38: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/38.jpg)
![Page 39: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/39.jpg)
Candidate Relationship Extraction
[Figure: six extractors produce the Candidate Relationships: WordNet Extractor, WiBi Extractor, WikiRelations Extractor, Shallow Parsing Grammar Extractor, Sequential Extractor, WikiTaxonomy Extractor. NLP data is an additional input. Relationship types produced: isA, partOf, prerequisite, pedagogicallyClose.]
![Page 40: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/40.jpg)
Candidate Relationship Extraction
Path-based extractors:
Example: Mars isA Rocky planet (path length = 1, confidence = 1); Mars isA Planet (path length = 2, confidence = 0.9).
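A path-based extractor of this kind can be sketched as a breadth-first walk up a hypernym graph, with confidence decreasing as the path gets longer. This is a sketch under stated assumptions: the linear decay of 0.1 per extra step is a guess chosen only because it reproduces the two values in the example (1 for length 1, 0.9 for length 2), and the graph, function name and parameters are hypothetical.

```python
from collections import deque


def path_based_isa(topic, topics, hypernyms, decay=0.1):
    """Return (topic, 'isA', ancestor, path_length, confidence) tuples.

    hypernyms maps a node to its direct parents; only ancestors that are
    themselves domain topics are reported.
    """
    results = []
    queue = deque([(topic, 0)])
    visited = {topic}
    while queue:
        node, length = queue.popleft()
        for parent in hypernyms.get(node, ()):
            if parent in visited:
                continue
            visited.add(parent)
            parent_length = length + 1
            queue.append((parent, parent_length))
            if parent in topics:
                # Assumed decay: 1.0 for a direct parent, 0.1 less per extra step.
                confidence = max(0.0, 1.0 - decay * (parent_length - 1))
                results.append(
                    (topic, "isA", parent, parent_length, round(confidence, 2))
                )
    return results


# Hypothetical hypernym graph for the example.
hypernym_graph = {"Mars": ["Rocky planet"], "Rocky planet": ["Planet"]}
domain_topics = {"Mars", "Rocky planet", "Planet"}

for relation in path_based_isa("Mars", domain_topics, hypernym_graph):
    print(relation)
```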
![Page 41: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/41.jpg)
Candidate Relationship Extraction
• WikiRelations: set of tuples that state the relationships between Wikipedia categories.
WikiRelations tuples (excerpt):
  T Tauri, Star, isA
  …
  Radiation, Radio waves, partOf
  Light, Electromagnetic radiation, partOf
  …
  T Tauri star, Star, isA
  007 license to kill, video games, isA
Example: the topic Light (Cat1: Light, Cat2: …) and the topic Electromagnetic radiation (Cat1: Electromagnetic radiation) match the tuple (Light, Electromagnetic radiation, partOf), which yields Light partOf Electromagnetic radiation (Confidence = 0.7).
![Page 42: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/42.jpg)
Candidate Relationship Extraction
• Extractor based on the rules defined in (Larrañaga, 2012).
[Figure: Shallow Parsing Grammar Extractor]
Topics: Solar System, Earth, Planet, Mars
Find Mentions (Textbook) → sentences with mentions: "Earth is part of the Solar System." …
Constraint Grammar applied to POS tags, with rules such as: Topic + [*] + part of + [det] + Topic …
Relationships: Earth partOf Solar System …
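The rule `Topic + [*] + part of + [det] + Topic` can be approximated with a regular expression over raw text. Note the substitution: the extractor described here applies a Constraint Grammar to POS tags, whereas this sketch uses plain string matching with a fixed determiner list, so it only illustrates the shape of the rule. All function names are hypothetical.

```python
import re


def build_part_of_pattern(topics):
    """Compile a regex for: Topic + [*] + 'part of' + [det] + Topic."""
    # Longest topics first, so multi-word topics are preferred.
    alternatives = "|".join(
        re.escape(t) for t in sorted(topics, key=len, reverse=True)
    )
    return re.compile(
        rf"\b(?P<part>{alternatives})\b"   # first Topic
        r"[^.]*?"                          # [*]: any words inside the sentence
        r"\bpart of\b\s+"                  # cue phrase
        r"(?:(?:the|a|an)\s+)?"            # [det]: optional determiner
        rf"(?P<whole>{alternatives})\b",   # second Topic
        re.IGNORECASE,
    )


def extract_part_of(sentence, topics):
    """Return (part, 'partOf', whole) triples found in one sentence."""
    pattern = build_part_of_pattern(topics)
    return [
        (m.group("part"), "partOf", m.group("whole"))
        for m in pattern.finditer(sentence)
    ]


domain_topics = ["Solar System", "Earth", "Planet", "Mars"]
print(extract_part_of("Earth is part of the Solar System.", domain_topics))
```

Working on POS tags instead of surface strings, as the original extractor does, lets the same rule cover determiners and intervening words without enumerating them.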
![Page 43: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/43.jpg)
Candidate Relationship Extraction
[Figure: extraction from co-occurring mentions]
Topics: Wavelength, Emission spectrum, Planet, Solar System
Find Mentions (Textbook) → sentences with mentions: "...leading to different radiated wavelengths, make up an emission spectrum. ... the emission spectrum of a particular star, the wavelength of …"
Possible candidates: Wavelength, Emission spectrum (2 times)
Look at links in / links out on Wikipedia: Emission spectrum (link out) Wavelength; Wavelength (link out) Emission spectrum
Reasoner: Relatedness > threshold
Relations: Emission spectrum pedagogicallyClose Wavelength …
![Page 44: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/44.jpg)
Candidate Relationship Extraction
[Figure: mention (link) counts between pairs of topics]
Topic1 and Topic2: mentions (links) of Topic2, 3 mentions; mentions (links) of Topic1, 4 mentions → Topic1 is pedagogicallyClose to Topic2.
Topic3 and Topic4: mentions (links) of Topic4, 1 mention; mentions (links) of Topic3, 4 mentions → Topic3 is a prerequisite of Topic4.
![Page 45: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/45.jpg)
Identification of Pedagogical Relationships: LiReWi
[Figure: LiReWi pipeline. Inputs: Electronic Textbook, Topics, Knowledge Bases. Stages: Mapping, Candidate Relationship Extraction, Combination & Filtering. Output: Learning Domain Ontology.]
![Page 46: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/46.jpg)
Combination & Filtering Relationships
Candidate relationships:
- Earth isA Planet (WordNet Ex) (Conf=1)
- Earth isA Planet (WikiRelations Ex) (Conf=0.8)
- Planet isA Earth (WikiTax Ex) (Conf=0.7)
- Earth partOf Solar System (WordNet Ex) (Conf=1)
- Earth isA Terrestrial Planet (WikiTax Ex) (Conf=0.5)

Confidence Combiner (relationships combined): Earth isA Planet (WordNet Ex, WikiRelations Ex) (Conf=1)
Conflict Resolver (conflict resolution) discards: Planet isA Earth (WikiTax Ex) (Conf=0.7)
Filter (below threshold) discards: Earth isA Terrestrial Planet (WikiTax Ex) (Conf=0.5)

Final Relationships:
- Earth isA Planet (WordNet Ex, WikiRelations Ex) (Conf=1)
- Earth partOf Solar System (WordNet Ex) (Conf=1)
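The three components can be sketched as three small functions applied in sequence. This is a sketch under stated assumptions: duplicates are merged by taking the maximum confidence, a conflict is a pair of inverse relationships of the same type and the lower-confidence one is dropped, and the filter threshold is 0.6. None of these choices is stated on the slide; they are picked only because they reproduce the example. Function names are hypothetical.

```python
from collections import defaultdict


def combine(candidates):
    """Merge identical relationships from different extractors (assumed rule: max confidence)."""
    merged = defaultdict(lambda: {"sources": [], "confidence": 0.0})
    for source, relation, target, extractor, confidence in candidates:
        entry = merged[(source, relation, target)]
        entry["sources"].append(extractor)
        entry["confidence"] = max(entry["confidence"], confidence)
    return dict(merged)


def resolve_conflicts(relationships):
    """Drop a relationship when its inverse exists with higher confidence."""
    kept = {}
    for (source, relation, target), info in relationships.items():
        inverse = relationships.get((target, relation, source))
        if inverse is not None and inverse["confidence"] > info["confidence"]:
            continue
        kept[(source, relation, target)] = info
    return kept


def filter_by_confidence(relationships, threshold=0.6):
    """Keep relationships whose confidence reaches the (assumed) threshold."""
    return {k: v for k, v in relationships.items() if v["confidence"] >= threshold}


# Candidate relationships from the example.
candidates = [
    ("Earth", "isA", "Planet", "WordNet", 1.0),
    ("Earth", "isA", "Planet", "WikiRelations", 0.8),
    ("Planet", "isA", "Earth", "WikiTax", 0.7),
    ("Earth", "partOf", "Solar System", "WordNet", 1.0),
    ("Earth", "isA", "Terrestrial Planet", "WikiTax", 0.5),
]

final = filter_by_confidence(resolve_conflicts(combine(candidates)))
for (source, relation, target), info in final.items():
    print(source, relation, target, info["sources"], info["confidence"])
```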
![Page 47: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/47.jpg)
Evaluation
Tuning up
• Introduction to Object Oriented Programming textbook.
Evaluation
• Gold Standard and Expert Validation.
• Introduction to Astronomy textbook.
• For the Gold Standard, four experts defined the set of relationships.
• A subset of the main domain topics was used, selected according to the score given by LiTeWi.
![Page 48: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/48.jpg)
Results
| | Precision (%) | Recall (%) | F1-Score (%) | ExpertValidation (%) |
|---|---|---|---|---|
| LiReWi | 36.21 | 50.57 | 42.42 | 43.98 |
| DOM-Sortze | 63.27 | 20.74 | 31.24 | N.A. |
![Page 49: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/49.jpg)
![Page 50: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/50.jpg)
Introduction
In DOM-Sortze, LOs were acquired for Basque using shallow parsing.
The approach has been validated for English.
LiLoWi has been developed to move towards the elicitation of Multilingual LOs.
![Page 51: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/51.jpg)
Adapting Learning Object elicitation to English
| | Basque | English |
|---|---|---|
| Pattern | adibidez, @topic | for instance, @topic |
| Example | Uretan, adibidez hidrogeno eta oxigeno atomoak daude. | For instance, there are hydrogen and oxygen atoms in water. |
[Figure: elicitation pipeline. Textbook + Topics (Wavelength, Emission spectrum, Earth, Solar System) → Find Mentions → Sentences with mentions (e.g. "Earth is a planet.") → Grammar → Learning Objects (e.g. "The Moon is Earth's only natural satellite.")]
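The pattern-based step can be illustrated with a minimal Python sketch. It is not the DOM-Sortze grammar: it assumes a plain regular expression stands in for the shallow-parsing rules, and that the cue phrase and the topic only need to share a sentence. The helper names `compile_pattern` and `find_examples` are hypothetical.

```python
import re

def compile_pattern(template, topics):
    """Compile a template such as 'for instance, @topic' into a regex.

    Assumption: the cue phrase must come first and the topic may follow
    anywhere later in the same sentence; a real grammar is likely stricter.
    """
    cue = template.partition("@topic")[0].strip()
    # Longest topics first so multi-word topics win over shorter ones.
    alternatives = "|".join(
        re.escape(t) for t in sorted(topics, key=len, reverse=True)
    )
    return re.compile(
        re.escape(cue) + r".*?\b(?P<topic>" + alternatives + r")\b",
        re.IGNORECASE,
    )

def find_examples(text, topics, template="for instance, @topic"):
    """Return (topic, sentence) pairs for sentences matching the template."""
    pattern = compile_pattern(template, topics)
    # Naive sentence splitter: break after '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    results = []
    for sentence in sentences:
        match = pattern.search(sentence)
        if match:
            results.append((match.group("topic").lower(), sentence))
    return results

text = (
    "Earth is a planet. "
    "For instance, there are hydrogen and oxygen atoms in water. "
    "The Moon is Earth's only natural satellite."
)
examples = find_examples(text, ["hydrogen", "oxygen", "Earth"])
```

Only the second sentence contains the cue phrase, so it is the only candidate "example" LO; supporting another language would mean supplying another template, such as the Basque one in the table.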
![Page 52: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/52.jpg)
Evaluation
Gold Standard and Expert Validation:
• Evaluated on Introduction to Object Oriented Programming.
• Gold Standard built by some experts.
Two Aspects
• Grammar.
• Learning Objects.
![Page 53: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/53.jpg)
Evaluation
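• Grammar

| | Definitions | Examples | Prob. Stat. | Princ. Stat. | Total |
|---|---|---|---|---|---|
| Found | 164 | 1 | 12 | 49 | 226 |
| Correct | 138 | 1 | 7 | 35 | 181 |
| Precision (%) | 84.15 | 100 | 58.33 | 71.43 | 80.09 |

• Learning Objects

| System | Recall (%) | Expert Validation (%) |
|---|---|---|
| DOM-Sortze | 70.31 | 91.88 |
| LiDom | 75.93 | 86.79 |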
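The precision row follows directly from the Found and Correct counts. The short Python check below recomputes it; the counts are copied from the table, while the function name `precision` is only illustrative.

```python
# Counts copied from the grammar evaluation table above.
found = {"Definitions": 164, "Examples": 1, "Prob. Stat.": 12, "Princ. Stat.": 49}
correct = {"Definitions": 138, "Examples": 1, "Prob. Stat.": 7, "Princ. Stat.": 35}

def precision(n_correct, n_found):
    """Precision as a percentage, rounded to two decimals."""
    return round(100 * n_correct / n_found, 2)

# Precision per type of learning object.
per_type = {kind: precision(correct[kind], found[kind]) for kind in found}

# Overall precision is computed from the summed counts,
# not as the mean of the per-type precisions.
total = precision(sum(correct.values()), sum(found.values()))
```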
![Page 54: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/54.jpg)
LiLoWi
[Figure: LiLoWi. Input: Topics (Solar System, Emission spectrum, Earth). Components: Multilingual LOs from WordNet/Wikipedia; Metadata Generator. Output: LOs per language (LO1en, LO2en, LO2es), linked as Equivalents.]
![Page 55: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/55.jpg)
Evaluation
• Evaluated on the Principles of Object-Oriented Programming.
• Used the same LDO described in the previous experiment.
• Expert Validation.
Two Aspects
• How LiLoWi enhanced the LO coverage for the LDO topics.
• How many multilingual LOs are extracted.
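As an illustration of the first aspect, the Python sketch below assumes that topic coverage means the share of LDO topics with at least one LO. The slides do not spell out the definition, so this reading is an assumption, and the topics and LOs are toy data rather than the evaluation data.

```python
def topic_coverage(ldo_topics, los_by_topic):
    """Percentage of LDO topics that have at least one LO (assumed definition)."""
    covered = [topic for topic in ldo_topics if los_by_topic.get(topic)]
    return round(100 * len(covered) / len(ldo_topics), 2)

def merge(*sources):
    """Combine LOs gathered by several approaches, topic by topic."""
    merged = {}
    for source in sources:
        for topic, los in source.items():
            merged.setdefault(topic, []).extend(los)
    return merged

# Toy data, for illustration only.
ldo_topics = ["class", "object", "inheritance", "polymorphism"]
grammar_los = {"class": ["A class is a blueprint for objects."]}
knowledge_base_los = {
    "object": ["An object is an instance of a class."],
    "inheritance": ["Inheritance lets a class reuse another class."],
}

grammar_only = topic_coverage(ldo_topics, grammar_los)
combined = topic_coverage(ldo_topics, merge(grammar_los, knowledge_base_los))
```

Comparing `grammar_only` with `combined` mirrors the question of how much coverage the knowledge bases add on top of the grammar-based approach.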
![Page 56: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/56.jpg)
Results
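• Grammar + Wikipedia/WordNet

| | Definitions (English) | Definitions (Spanish) | Definitions (Basque) | Definitions (French) | References |
|---|---|---|---|---|---|
| Number of topics | 46 | 36 | 9 | 36 | 12 |
| Topic coverage (%) | 56.10 | 43.90 | 10.97 | 43.90 | 14.63 |

• Grammar-based approach

| | Total | Definitions |
|---|---|---|
| Number of topics | 21 | 19 |
| Topic coverage (%) | 25.61 | 19.51 |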
![Page 57: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/57.jpg)
Outline
I. Introduction: Motivation and Goals
II. LiDom Builder: Building Multilingual Domain Modules
III. Acquisition of Multilingual Terminology
IV. Identification of Pedagogical Relationships
V. Gathering Multilingual Learning Objects
VI. Conclusions and Future Work
![Page 58: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/58.jpg)
Goal Achievement
1. Provided a suitable formalism to represent Multilingual Domain Modules.
2. Developed a method for the elicitation of multilingual terminology.
   – To our knowledge, the first term extractor based on search patterns for educational content.
3. Improved Relationship Acquisition.
   – Extension of the outline processor to English + enhancement with Wikipedia.
   – Development of LiReWi, a module for the elicitation of pedagogical relationships for Educational Ontologies.
   – Development of a state-of-the-art mapper from Wikipedia to WordNet.
4. Developed a method for multilingual LO generation.
   – Extension of DOM-Sortze to English.
   – Development of LiLoWi, a module for the elicitation of multilingual LOs using different knowledge bases.
![Page 59: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/59.jpg)
Conclusions and Future Work
Future Work
• Automatising the inclusion of new languages.
• Multilingual Learning Object generation from similarity and machine translation techniques.
• Concept Map-Based Learning Object Generation.
• Improvements on each module of LiDom Builder.
![Page 60: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/60.jpg)
Software Released
Software
• LiTeWi, released with Spanish/English support: https://github.com/Neuw84/LiTe
• Wikipedia/WordNet mapper: https://github.com/Neuw84/Wikipedia2WordNet
• Spanish stemmer: https://github.com/Neuw84/SpanishInflectorStemmer
• Training Data for Wikiminer: https://github.com/Neuw84/Wikipedia353Spanish
• LiReWi: coming soon.
Web Demo
• LiDom Builder: http://galan.ehu.es/lidom/
![Page 61: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/61.jpg)
Publications
A Combined Approach for Eliciting Relationships for Educational Ontologies Using Several Knowledge Bases. Ángel Conde, Mikel Larrañaga, Ana Arruarte, Jon A. Elorriaga. Journal of Knowledge-Based Systems. Submitted.
LiteWi: A Combined Term Extraction Method for Eliciting Educational Ontologies from Textbooks. Ángel Conde, Mikel Larrañaga, Ana Arruarte, Jon A. Elorriaga, Dan Roth. Journal of the Association for Information Science and Technology, 67(2), pp. 380–399, 2016.
Testing Language Independence in the Semiautomatic Construction of Educational Ontologies. Ángel Conde, Mikel Larrañaga, Ana Arruarte, Jon A. Elorriaga. 12th International Conference on Intelligent Tutoring Systems ITS 2014, Springer, Vol. 8474, pp. 545–550, 2014.
Automatic Generation of the Domain Module from Electronic Textbooks. Method and Validation. Mikel Larrañaga, Ángel Conde, Iñaki Calvo, Jon A. Elorriaga, Ana Arruarte. IEEE Transactions on Knowledge and Data Engineering, 26(1), pp. 69–82, 2014.
Automating the Authoring of Learning Material in Computer Engineering Education. Ángel Conde, Mikel Larrañaga, Iñaki Calvo, Jon A. Elorriaga, Ana Arruarte. 42nd Frontiers in Education Conference, pp. 1376–1381, 2012.
![Page 62: Presentacion tesisAngel-revisada](https://reader036.fdocuments.us/reader036/viewer/2022070600/589ae05e1a28abee708b4d67/html5/thumbnails/62.jpg)
LiDom Builder: Automatising the Construction of Multilingual Domain Modules
Ángel Conde Manjón
GaLan Research Group – LSI Department, University of the Basque Country (UPV/EHU)
Supervisors: Mikel Larrañaga Olagaray & Ana Arruarte Lasa
UPV/EHU