Triplet Extraction from Sentences Technical University of Cluj-Napoca Conf. Dr. Ing. Tudor Mureşan...
Triplet Extraction from Sentences

Technical University of Cluj-Napoca
Conf. Dr. Ing. Tudor Mureşan

“Jožef Stefan” Institute, Ljubljana, Slovenia
Assist. Prof. Dr. Dunja Mladenić
Blaž Fortuna
Marko Grobelnik

Lorand Dali
June 2008
Location of the project in the field of Computer Science
Artificial Intelligence
Natural Language Processing
Machine Learning
My father carries around the picture of the kid who came with his wallet.
Motivation of Triplet Extraction
Advantages
◦ compact and simple representation of the information contained in a sentence
◦ avoids the complexity of a full parse
◦ contains semantic information
Applications
◦ building the semantic graph of a document
◦ summarization
◦ question answering
Triplet Extraction – 2 Approaches
Extraction from the parse tree of the sentence using heuristic rules
◦ OpenNLP – Treebank parse tree
◦ Link Parser – Link Grammar (a type of dependency grammar)
Extraction using Machine Learning
◦ Support Vector Machines (SVM) are used
◦ The SVM model is trained on human-annotated data
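To make the first approach concrete, here is a minimal sketch of heuristic extraction from a Treebank-style parse tree. The tree encoding (nested tuples), the helper names, and the three rules (subject = first noun of the sentence's NP, verb = first verb of its VP, object = first noun of the NP inside that VP) are assumptions chosen for illustration; the slides do not list the actual rules used.

```python
# A parse tree is a nested tuple: (label, child, child, ...).
# A leaf is (POS_tag, word). This encoding is an assumption for the sketch.

def is_leaf(node):
    return len(node) == 2 and isinstance(node[1], str)

def children(node):
    return [] if is_leaf(node) else list(node[1:])

def first_child(node, label):
    """Return the first direct child carrying the given phrase label, or None."""
    for child in children(node):
        if child[0] == label:
            return child
    return None

def first_leaf(node, pos_prefix):
    """Depth-first search for the first word whose POS tag starts with pos_prefix."""
    if is_leaf(node):
        return node[1] if node[0].startswith(pos_prefix) else None
    for child in children(node):
        word = first_leaf(child, pos_prefix)
        if word is not None:
            return word
    return None

def extract_triplet(sentence_tree):
    """Apply the three illustrative rules; return (subj, verb, obj) or None."""
    np = first_child(sentence_tree, "NP")
    vp = first_child(sentence_tree, "VP")
    if np is None or vp is None:
        return None
    obj_np = first_child(vp, "NP")
    subj = first_leaf(np, "NN")
    verb = first_leaf(vp, "VB")
    obj = first_leaf(obj_np, "NN") if obj_np is not None else None
    if subj is None or verb is None or obj is None:
        return None
    return (subj, verb, obj)

# Hand-written parse of a shortened version of the example sentence.
tree = ("S",
        ("NP", ("PRP$", "My"), ("NN", "father")),
        ("VP", ("VBZ", "carries"),
               ("NP", ("DT", "the"), ("NN", "picture"))))

print(extract_triplet(tree))
```

Rules of this kind are easy to read and debug, but each one covers only the sentence shapes it was written for, which is one motivation for the machine learning approach.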
Short review of SVM
Features of the triplet candidates
Over 300 features depending on:
Sentence
◦ length of sentence, number of words, etc.
Candidate
◦ context of Subj, Verb and Obj
◦ distance between Subj, Verb, Obj
Linkage
◦ number of links, of link types, nr of links from S, V, O
Minipar
◦ depth, diameter, siblings, uncles, cousins, categories, relations
Treebank
◦ depth, diameter, siblings, uncles, cousins, path to root, POS
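As an illustration of the Sentence and Candidate feature groups only, the sketch below computes a handful of features for one (Subj, Verb, Obj) candidate given as token positions. The function name, the feature names, and the choice of one neighbouring word as "context" are assumptions; the parser-based groups (Linkage, Minipar, Treebank) need parser output and are not covered.

```python
def candidate_features(tokens, s, v, o):
    """Features for one candidate; s, v, o are the token indices of Subj, Verb, Obj.

    Feature names and definitions are illustrative, not the original feature set.
    """
    def context(i, offset):
        # Neighbouring word, or a placeholder at the sentence boundary.
        j = i + offset
        return tokens[j].lower() if 0 <= j < len(tokens) else "<none>"

    return {
        # Sentence-level features
        "sentence_chars": sum(len(t) for t in tokens),
        "num_words": len(tokens),
        # Distances between the candidate's elements, in tokens
        "dist_subj_verb": v - s,
        "dist_verb_obj": o - v,
        "dist_subj_obj": o - s,
        # Context: the word before and after each element
        "subj_prev": context(s, -1),
        "subj_next": context(s, +1),
        "verb_prev": context(v, -1),
        "verb_next": context(v, +1),
        "obj_prev": context(o, -1),
        "obj_next": context(o, +1),
    }

tokens = "My father carries around the picture of the kid who came with his wallet".split()
features = candidate_features(tokens, s=1, v=2, o=5)  # father, carries, picture
for name, value in features.items():
    print(name, value)
```

Categorical features such as the context words would still have to be encoded numerically (for example one-hot) before being passed to an SVM.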
Evaluation and Testing
Training set = 700 annotated sentences
Test set = 100 annotated sentences
Compare the triplets extracted from a sentence to the annotated triplets for that same sentence
Comparison is done according to a similarity measure in [0, 1] between two triplets
extracted to annotated => precision
annotated to extracted => recall
Conclusions
Triplet extraction using hand rules
Triplet extraction using machine learning (SVM)
Question answering system based on triplets
Questions
Triplet Similarity Measure
(S, V, O) is compared with (S’, V’, O’) component by component, giving SubjSim, VerbSim, ObjSim
TrSim = (SubjSim + VerbSim + ObjSim) / 3
TrSim, SubjSim, VerbSim, ObjSim ∈ [0, 1]
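The averaging formula can be sketched as follows. The function `tr_sim` and the stand-in component similarity `exact_match` are hypothetical names; exact matching is used here only to keep the example short, and any component similarity with values in [0, 1] can be passed in instead.

```python
def exact_match(a, b):
    # Stand-in component similarity (an assumption): 1.0 for equal strings, else 0.0.
    return 1.0 if a.lower() == b.lower() else 0.0

def tr_sim(triplet_a, triplet_b, component_sim=exact_match):
    """TrSim = (SubjSim + VerbSim + ObjSim) / 3 for two (subj, verb, obj) triplets."""
    if len(triplet_a) != 3 or len(triplet_b) != 3:
        raise ValueError("each triplet must be (subject, verb, object)")
    subj_sim, verb_sim, obj_sim = (
        component_sim(a, b) for a, b in zip(triplet_a, triplet_b)
    )
    return (subj_sim + verb_sim + obj_sim) / 3

score = tr_sim(("father", "carries", "picture"), ("father", "carries", "wallet"))
print(score)  # two of three components match
```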
String Similarity Measure
The way to success is under heavy construction → way success under heavy construction
The road to success is always under construction → road success under construction
Sim = nMatch / maxLen = 3 / 5 = 0.6
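A minimal sketch of this measure, assuming that the reduction step is stop-word removal and that nMatch counts the words the two reduced strings share. The stop-word set below is hypothetical: it contains just the words dropped in the slide's example.

```python
from collections import Counter

# Hypothetical stop-word list: exactly the words removed in the example above.
STOP_WORDS = {"the", "to", "is", "always"}

def content_words(text):
    """Lowercase, strip surrounding punctuation, and drop stop words."""
    words = (w.strip(".,;:!?\"'") for w in text.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]

def string_sim(a, b):
    """Sim = nMatch / maxLen, computed on the content words of both strings."""
    words_a, words_b = content_words(a), content_words(b)
    max_len = max(len(words_a), len(words_b))
    if max_len == 0:
        return 0.0  # convention chosen for this sketch when both strings are empty
    # Multiset intersection, so a repeated word is matched at most as often as it occurs in both.
    n_match = sum((Counter(words_a) & Counter(words_b)).values())
    return n_match / max_len

sim = string_sim("The way to success is under heavy construction",
                 "The road to success is always under construction")
print(sim)  # 3 shared words / 5 words in the longer string
```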
Evaluating the extracted triplets
[Figure: for one sentence, the extracted triplets (Tr1, Tr2, Tr3) are set against the golden standard triplets (Tr1, Tr2); the comparison between the two sets yields precision and recall.]
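One way to turn a triplet similarity into precision and recall is sketched below. The aggregation is an assumption, since the slides do not spell it out: each extracted triplet is scored by its best match among the golden standard triplets and the scores are averaged for precision; recall does the same in the other direction. The `exact` similarity is a stand-in for any triplet similarity in [0, 1].

```python
def best_match_average(source, target, sim):
    """Average, over source triplets, of the best similarity to any target triplet."""
    if not source:
        return 0.0
    return sum(max((sim(s, t) for t in target), default=0.0) for s in source) / len(source)

def precision_recall(extracted, gold, sim):
    precision = best_match_average(extracted, gold, sim)  # extracted to annotated
    recall = best_match_average(gold, extracted, sim)     # annotated to extracted
    return precision, recall

def exact(a, b):
    # Stand-in triplet similarity (an assumption): 1.0 if identical, else 0.0.
    return 1.0 if a == b else 0.0

extracted = [("father", "carries", "picture"),
             ("kid", "came", "wallet"),
             ("father", "came", "kid")]
gold = [("father", "carries", "picture"),
        ("picture", "came", "wallet")]

precision, recall = precision_recall(extracted, gold, exact)
print(precision, recall)  # 1 of 3 extracted matched, 1 of 2 gold matched
```

With a graded similarity instead of exact matching, a near miss contributes partial credit rather than zero.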
My father carries around the picture of the kid who came with his wallet.
Question Types
Yes/No Questions
List Questions
Reason Questions
Quantity Questions
Location Questions
Time Questions
Block Diagram of QA System
Question → Parse and determine question type → Build Query → Search Triplets → Answer
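The pipeline stages can be sketched in a few lines. Everything here is an assumption made for illustration: the first-word rules in `question_type`, the representation of a query as a (subj, verb, obj) pattern with `None` as a wildcard, and the exact-match search. The Build Query stage, which would need a real parse of the question, is skipped by passing the pattern in by hand.

```python
YES_NO_STARTS = {"is", "are", "was", "were", "do", "does", "did", "can"}

def question_type(question):
    """Hypothetical first-word rules mapping a question to one of the six types."""
    words = question.lower().rstrip("?").split()
    if not words:
        return "unknown"
    if words[0] in YES_NO_STARTS:
        return "yes/no"
    if words[0] == "why":
        return "reason"
    if words[0] == "where":
        return "location"
    if words[0] == "when":
        return "time"
    if words[:2] in (["how", "many"], ["how", "much"]):
        return "quantity"
    return "list"

def search_triplets(query, triplets):
    """Return triplets matching the pattern; None in the query matches anything."""
    return [t for t in triplets
            if all(q is None or q == x for q, x in zip(query, t))]

def answer(question, query, triplets):
    matches = search_triplets(query, triplets)
    if question_type(question) == "yes/no":
        return "Yes" if matches else "No"
    # For other types, return the values that fill the wildcard slots.
    return [tuple(x for q, x in zip(query, t) if q is None) for t in matches]

store = [("father", "carries", "picture"), ("kid", "came", "wallet")]
print(answer("Does the father carry the picture?", ("father", "carries", "picture"), store))
print(answer("What does the father carry?", ("father", "carries", None), store))
```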
If a listener nods his head while you're explaining your program, wake him up.