Danica Damljanović, Milan Agatonović, Hamish Cunningham
contact: [email protected]
Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-Based Lookup through the User Interaction
ESWC 2010
WEB OF DATA
Large datasets such as Linked Open Data are available
How can we use these data?
Modigliani test: “tell me the locations of all the original paintings of Modigliani” (Richard MacManus, ReadWriteWeb)
03 JUNE 2010
PREFIX fb: <http://rdf.freebase.com/ns/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbp-prop: <http://dbpedia.org/property/>
PREFIX dbp-ont: <http://dbpedia.org/ontology/>
PREFIX umbel-sc: <http://umbel.org/umbel/sc/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ot: <http://www.ontotext.com/>
SELECT DISTINCT ?painting_l ?owner_l ?city_fb_con ?city_db_loc ?city_db_cit
WHERE {
  ?p fb:visual_art.artwork.artist dbpedia:Amedeo_Modigliani ;
     fb:visual_art.artwork.owners [ fb:visual_art.artwork_owner_relationship.owner ?ow ] ;
     ot:preferredLabel ?painting_l .
  ?ow ot:preferredLabel ?owner_l .
  OPTIONAL { ?ow fb:location.location.containedby [ ot:preferredLabel ?city_fb_con ] } .
  OPTIONAL { ?ow dbp-prop:location ?loc .
             ?loc rdf:type umbel-sc:City ;
                  ot:preferredLabel ?city_db_loc }
  OPTIONAL { ?ow dbp-ont:city [ ot:preferredLabel ?city_db_cit ] }
}
PASSING MODIGLIANI TEST
Source: http://blog.larkc.eu/, “LDSR Passes the Modigliani Test for Semantic Web”; it took more than 1 hour to generate a SPARQL query
PASSING MODIGLIANI TEST: FUTURE
“tell me the locations of all the original paintings of Modigliani”
BUT, OTHERS HAVE ALREADY DONE IT?
[Figure: quadrant chart. Quadrant labels: low precision, high recall; low precision, low recall; high precision, high recall; high precision, low recall. Other labels: large datasets (several domains); small datasets (narrow domain); simple factual questions; complex questions.]
(Damljanović and Bontcheva, 2009)
FREYA (FEEDBACK, REFINEMENT, EXTENDED VOCABULARY AGGREGATOR)
Increase recall by:
generating the dialog whenever an “unknown” term appears in the question
Increase precision by:
generating the dialog whenever one term refers to more than one concept in the ontology
The dialog is generated by combining the language of the user and the ontology
Learn from the dialog
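The two dialog triggers above can be sketched in a few lines. This is a minimal illustration, not FREyA's implementation: the `Lexicon` class, the `dialog_needed` function, and the term "inhabitants" are hypothetical, and the lookup is a plain dictionary standing in for ontology-based lookup. Learning is modelled, as an assumption, by moving the user's choice to the top of the candidate list.

```python
from dataclasses import dataclass, field


@dataclass
class Lexicon:
    """Toy label-to-concept index (hypothetical stand-in for ontology lookup)."""
    labels: dict = field(default_factory=dict)

    def candidates(self, term):
        """Return the ontology concepts currently associated with a term."""
        return list(self.labels.get(term.lower(), []))

    def learn(self, term, concept):
        """Store the user's choice as the first candidate for this term."""
        current = self.labels.setdefault(term.lower(), [])
        if concept in current:
            current.remove(concept)
        current.insert(0, concept)


def dialog_needed(lexicon, term):
    """Return 'unknown', 'ambiguous', or None if the term maps to one concept."""
    found = lexicon.candidates(term)
    if not found:
        return "unknown"      # no match: a dialog can increase recall
    if len(found) > 1:
        return "ambiguous"    # several matches: a dialog can increase precision
    return None


lexicon = Lexicon({
    "new york": ["geo:City", "geo:State"],
    "population": ["geo:cityPopulation"],
})
```

With this toy lexicon, "new york" triggers an ambiguity dialog, "population" needs no dialog, and the made-up term "inhabitants" triggers an unknown-term dialog until `learn` records a mapping for it.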
FREYA WORKFLOW
[Workflow diagram. Labels: NL query; POCs and OCs; triples; SPARQL; answer; learn.]
POC: Potential Ontology Concept
OC: Ontology Concept
FINDING POCS
FINDING OCS
MAPPING POC TO OCS
[Figure: the terms "new york" and "population" are marked as POCs; the OCs shown are geo:City, geo:State, and geo:cityPopulation.]
NEW YORK IS A CITY
NEW YORK IS A STATE
THE USER CONTROLS THE OUTPUT
[Figure: the terms "state", "area", and "point" are marked as POCs; the OCs shown are geo:State, geo:stateArea, geo:isLowestPointOf, geo:LoPoint, and geo:loElevation, with the modifiers max and min.]
WHAT IS THE LOWEST POINT OF THE STATE WITH THE LARGEST AREA?
TRIPLES:
?firstJoker – geo:isLowestPointOf – geo:State
geo:State – (max) geo:stateArea – ?lastJoker
SPARQL:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?firstJoker ?p0 ?c1 ?p2 ?lastJoker
where {
  { { ?c1 ?p0 ?firstJoker } UNION { ?firstJoker ?p0 ?c1 } .
    filter (?p0=<http://www.mooney.net/geo#isLowestPointOf>) . }
  ?c1 rdf:type <http://www.mooney.net/geo#State> .
  ?c1 ?p2 ?lastJoker .
  filter (?p2=<http://www.mooney.net/geo#stateArea>) .
}
ORDER BY DESC(xsd:double(?lastJoker))
however...
WHAT IS THE LOWEST POINT OF THE STATE WITH THE LARGEST AREA?
TRIPLES:
?firstJoker – (min) geo:loElevation – geo:LoPoint
geo:LoPoint – ?joker3 – geo:State
geo:State – (max) geo:stateArea – ?lastJoker
SPARQL:
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
select ?firstJoker ?p0 ?c1 ?joker3 ?c2 ?p3 ?lastJoker
where {
  ?c1 ?p0 ?firstJoker .
  filter (?p0=<http://www.mooney.net/geo#loElevation>) .
  ?c1 rdf:type <http://www.mooney.net/geo#LoPoint> .
  { { ?c2 ?joker3 ?c1 } UNION { ?c1 ?joker3 ?c2 } }
  ?c2 rdf:type <http://www.mooney.net/geo#State> .
  ?c2 ?p3 ?lastJoker .
  filter (?p3=<http://www.mooney.net/geo#stateArea>) .
}
ORDER BY ASC(xsd:double(?firstJoker)) DESC(xsd:double(?lastJoker))
The answer for both is Death Valley.
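In both queries the (min) and (max) markers on the triples end up as ASC and DESC terms in the ORDER BY clause. The sketch below reproduces only that step. The function `order_by_clause` and its input format are assumptions made for illustration, not FREyA's actual code.

```python
def order_by_clause(modifiers):
    """Build a SPARQL ORDER BY clause from (variable, modifier) pairs.

    modifiers: list of (variable_name, "min" | "max") in triple order.
    Hypothetical helper: "min" sorts ascending, "max" sorts descending.
    """
    direction = {"min": "ASC", "max": "DESC"}
    parts = [f"{direction[mod]}(xsd:double(?{var}))" for var, mod in modifiers]
    return "ORDER BY " + " ".join(parts) if parts else ""


# First interpretation: only the state area carries a modifier.
first_clause = order_by_clause([("lastJoker", "max")])

# Second interpretation: lowest elevation first, then largest area.
second_clause = order_by_clause([("firstJoker", "min"), ("lastJoker", "max")])
```

Here `first_clause` and `second_clause` match the ORDER BY lines of the two queries shown on the slides.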
FREYA: A NATURAL LANGUAGE INTERFACE TO ONTOLOGIES
http://gate.ac.uk/freya
EVALUATION
• correctness
• ranked suggestions
• learning
EVALUATION: CORRECTNESS
Mooney GeoQuery dataset: 250 questions
Precision = Recall = 92.4%
[Chart. Segment values: 19, 32, 127, 72. Legend: incorrect, 2 dialogs, 1 dialog, no dialog.]
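The 92.4% figure can be checked against the chart values. The sketch assumes that the segment of 19 is the "incorrect" one (the pairing of values to legend entries is not explicit in the transcript) and that the system returns an answer for every question, which is what makes precision and recall coincide.

```python
# Chart segment values as they appear on the slide.
segments = [19, 32, 127, 72]
total_questions = sum(segments)          # should equal the 250 questions

# Assumption: the segment of 19 holds the incorrectly answered questions.
incorrect = 19
correct = total_questions - incorrect

# Assumption: every question receives an answer, so answered == total.
answered = total_questions
precision = correct / answered           # correct answers / answers given
recall = correct / total_questions       # correct answers / all questions
```

Under these assumptions both ratios are 231/250, which matches the reported value.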
EVALUATION: SUGGESTIONS RANKING
Mooney GeoQuery dataset: 250 questions
Manually labelled correct rankings
Mean Reciprocal Rank (MRR): 0.81
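For reference, MRR averages the reciprocal of the rank at which the correct suggestion appears in each dialog. The ranks in the example are made up for illustration and are not taken from the evaluation; `mean_reciprocal_rank` is a generic helper, not part of FREyA.

```python
def mean_reciprocal_rank(ranks):
    """Compute MRR from 1-based ranks of the correct suggestion.

    ranks: one entry per dialog; None means the correct suggestion
    did not appear in the list and contributes 0.
    """
    if not ranks:
        raise ValueError("at least one rank is required")
    return sum(0.0 if r is None else 1.0 / r for r in ranks) / len(ranks)


# Illustrative ranks only: correct suggestion at positions 1, 2, 1 and 4.
example_mrr = mean_reciprocal_rank([1, 2, 1, 4])  # (1 + 0.5 + 1 + 0.25) / 4
```

An MRR of 1.0 would mean the correct suggestion was always ranked first.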
EVALUATION: LEARNING
103 questions correctly answered by engaging the user in 1 dialog
MRR: 0.72
EVALUATION: LEARNING
MRR improved from 0.72 to 0.78
NEXT STEPS
Passing Modigliani test
Exploring unknown data structures with FREyA, especially if they are large
LDSR: DBPedia, Freebase, Geonames, UMBEL, Wordnet, CIA World Factbook, Lingvoj, MusicBrainz
http://ontotext.com/ldsr
User-centric evaluation
Contact: [email protected]
THANK YOU FOR YOUR ATTENTION! QUESTIONS?
Thanks to Abraham Bernstein and Esther Kaufmann from the University of Zurich for sharing the Mooney dataset in OWL format with us, and to J. Mooney from the University of Texas for making this dataset publicly available.
REFERENCES
Damljanovic, D., Bontcheva, K.: Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases. In: Devedzic, V., Gasevic, D. (eds.), Special issue on Semantic Web and Web 2.0, Annals of Information Systems, Springer-Verlag, 2009.