The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work...
![Page 1: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/1.jpg)
The Semantic Web: there and back again
Tim Finin, University of Maryland, Baltimore County
Joint work with Lushan Han, Varish Mulwad, Anupam Joshi
http://ebiq.org/r/353
![Page 2: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/2.jpg)
LOD 123: Making the semantic web easier to use
Tim Finin, University of Maryland, Baltimore County
Joint work with Lushan Han, Varish Mulwad, Anupam Joshi
![Page 3: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/3.jpg)
Semantic Web: then and now
• Ten years ago we developed complex ontologies used to encode and reason over small datasets of 1000s of facts
• Recently the focus has shifted to using simple ontologies and minimal reasoning over very large datasets of 100s of millions of facts
• Major companies are moving: Google Knowledge Graph, Facebook Open Graph, Microsoft Satori, Apple Siri KB, IBM Watson KB
Linked open data or “Things, not strings”
![Page 4: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/4.jpg)
Linked Open Data (LOD)
• Linked data is just RDF data, lots of it, with a small schema
• RDF data is a graph of triples (subject, predicate, object)
– URI URI String: dbr:Barack_Obama dbo:spouse “Michelle Obama”
– URI URI URI: dbr:Barack_Obama dbo:spouse dbpedia:Michelle_Obama
• Best linked data practice prefers the 2nd pattern, using nodes rather than strings for “entities”
– Things, not strings!
• Linked open data is just linked data freely accessible on the Web along with its ontologies
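The difference between the two triple patterns can be sketched in a few lines of Python. This is only an illustration: the `Triple` type, the `can_follow` helper and the extra `rdfs:label` fact are hypothetical, and prefixed names are kept as plain strings instead of full URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    obj_is_uri: bool  # True when the object is a node, False when it is a string


# The two patterns from the slide.
string_pattern = Triple("dbr:Barack_Obama", "dbo:spouse", "Michelle Obama", False)
node_pattern = Triple("dbr:Barack_Obama", "dbo:spouse", "dbpedia:Michelle_Obama", True)

# Hypothetical extra fact about the spouse node, for illustration only.
graph = {
    node_pattern,
    Triple("dbpedia:Michelle_Obama", "rdfs:label", "Michelle Obama", False),
}


def outgoing(graph, node):
    """Return the triples whose subject is `node`."""
    return sorted(t for t in graph if t.subject == node)


def can_follow(triple, graph):
    """A node object can be followed to further facts; a string is a dead end."""
    return triple.obj_is_uri and bool(outgoing(graph, triple.obj))


print(can_follow(node_pattern, graph), can_follow(string_pattern, graph))
```

With a node as the object, a client can keep walking the graph; with a string it has nothing further to look up.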
![Page 5: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/5.jpg)
Semantic web technologies allow machines to share data and knowledge using common web languages and protocols.
~ 1997
Semantic Web
Semantic Web beginning
Use Semantic Web Technology to publish shared data & knowledge
![Page 6: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/6.jpg)
2007
Semantic Web => Linked Open Data
Use Semantic Web Technology to publish shared data & knowledge
Data is inter-linked to support integration and fusion of knowledge
LOD beginning
![Page 7: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/7.jpg)
2008
Semantic Web => Linked Open Data
Use Semantic Web Technology to publish shared data & knowledge
Data is inter-linked to support integration and fusion of knowledge
LOD growing
![Page 8: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/8.jpg)
2009
Semantic Web => Linked Open Data
Use Semantic Web Technology to publish shared data & knowledge
Data is inter-linked to support integration and fusion of knowledge
… and growing
![Page 9: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/9.jpg)
Linked Open Data
2010
LOD is the new Cyc: a common source of background knowledge
Use Semantic Web Technology to publish shared data & knowledge
Data is inter-linked to support integration and fusion of knowledge
…growing faster
![Page 10: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/10.jpg)
Linked Open Data
2011: 31B facts in 295 datasets interlinked by 504M assertions on ckan.net
LOD is the new Cyc: a common source of background knowledge
Use Semantic Web Technology to publish shared data & knowledge
Data is inter-linked to support integration and fusion of knowledge
![Page 11: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/11.jpg)
Exploiting LOD not (yet) Easy
• Publishing or using LOD data has inherent difficulties for the potential user
– It’s difficult to explore LOD data and to query it for answers
– It’s challenging to publish data using appropriate LOD vocabularies & link it to existing data
• Problem: O(10^4) schema terms, O(10^11) instances
• I’ll describe two ongoing research projects that are addressing these problems
![Page 12: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/12.jpg)
GoRelations: Intuitive Query System for Linked Data
Research with Lushan Han
http://ebiq.org/j/93
![Page 13: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/13.jpg)
DBpedia is the Stereotypical LOD
• DBpedia is an important example of Linked Open Data
– Extracts structured data from Infoboxes in Wikipedia
– Stores in RDF using custom ontologies plus Yago terms
• The major integration point for the entire LOD cloud
• Explorable as HTML, but harder to query in SPARQL
DBpedia
![Page 14: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/14.jpg)
Browsing DBpedia’s Mark Twain
![Page 15: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/15.jpg)
Why it’s hard to query LOD
• Querying DBpedia requires a lot of a user
– Understand the RDF model
– Master SPARQL, a formal query language
– Understand ontology terms: 320 classes & 1600 properties!
– Know instance URIs (>2M entities!)
– Term heterogeneity (Place vs. PopulatedPlace)
• Querying large LOD sets is overwhelming
• Natural language query systems are still a research goal
![Page 16: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/16.jpg)
Goal
• Let users with a basic understanding of RDF query DBpedia and other LOD collections
– Explore what data is in the system
– Get answers to questions
– Create SPARQL queries for reuse or adaptation
• Desiderata
– Easy to learn and to use
– Good accuracy (e.g., precision and recall)
– Fast
![Page 17: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/17.jpg)
Key Idea
Structured keyword queries reduce problem complexity:
– User enters a simple graph, and
– Annotates the nodes and arcs with words and phrases
![Page 18: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/18.jpg)
Structured Keyword Queries
• Nodes are entities and links are binary relations
• Entities are described by two unrestricted terms: name or value, and type or concept
• Outputs are marked with ?
• Compromise between a natural language Q&A system and a formal query
– Users provide the compositional structure of the question
– Free to use their own terms to annotate the structure
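A structured keyword query of this kind can be held in a very small data structure. The sketch below is an assumption about one possible representation, not the GoRelations implementation: the class names `Node`, `Link` and `StructuredKeywordQuery` are hypothetical, and the book title is an invented example value. The Place, Author and Book terms and the "born in" and "wrote" relations are the ones used in the translation example later in the deck.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    type_term: str               # user's word for the type or concept
    name: Optional[str] = None   # user's word for the name or value, if known
    is_output: bool = False      # True for nodes marked with "?"


@dataclass(frozen=True)
class Link:
    source: Node
    relation: str                # user's phrase for the binary relation
    target: Node


@dataclass
class StructuredKeywordQuery:
    links: List[Link] = field(default_factory=list)

    def nodes(self):
        """Distinct nodes in order of first appearance."""
        seen = []
        for link in self.links:
            for node in (link.source, link.target):
                if node not in seen:
                    seen.append(node)
        return seen

    def outputs(self):
        return [n for n in self.nodes() if n.is_output]


place = Node("Place", is_output=True)
author = Node("Author")
book = Node("Book", name="Tom Sawyer")  # hypothetical value, for illustration
query = StructuredKeywordQuery(
    [Link(author, "born in", place), Link(author, "wrote", book)]
)
print([n.type_term for n in query.nodes()], query.outputs())
```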
![Page 19: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/19.jpg)
Translation – Step One: finding semantically similar ontology terms
For each graph concept/relation, generate k most semantically similar ontology classes/properties
Lexical similarity metric based on distributional similarity, LSA, and WordNet
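The candidate-generation step can be sketched as a top-k selection over any similarity function. The scores below are hand-made placeholders standing in for the real metric (which the slide says combines distributional similarity, LSA and WordNet); `toy_similarity`, `top_k_terms` and the numbers are all hypothetical.

```python
import heapq

# Hypothetical similarity scores; a real system would compute these
# from corpus statistics and WordNet rather than look them up.
TOY_SIMILARITY = {
    ("author", "Writer"): 0.82,
    ("author", "Artist"): 0.55,
    ("author", "Book"): 0.40,
    ("author", "Place"): 0.05,
}


def toy_similarity(user_term, ontology_term):
    return TOY_SIMILARITY.get((user_term.lower(), ontology_term), 0.0)


def top_k_terms(user_term, ontology_terms, k, similarity=toy_similarity):
    """Return the k (score, term) pairs with the highest similarity."""
    scored = [(similarity(user_term, term), term) for term in ontology_terms]
    return heapq.nlargest(k, scored)


candidates = top_k_terms("Author", ["Place", "Writer", "Book", "Artist"], k=2)
print(candidates)
```

Keeping k candidates per term, not just the best one, is what leaves room for the disambiguation step to pick a consistent combination.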
![Page 21: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/21.jpg)
Semantic Textual Similarity task
• 2013 lexical and computational semantics conference
• Do two sentences have the same meaning (0…5)
– 1: “The woman is playing the violin” vs. “The young lady enjoys listening to the guitar”
– 4: “In May 2010, the troops attempted to invade Kabul” vs. “The US army invaded Kabul on May 7th last year, 2010”
• 2012: 35 teams, 88 runs; 2013: 36 teams, 89 runs
• 2250 sentence pairs from four domains
• Our three runs ranked #1, #2 and #4

Scores per run, with rank in parentheses:

| Dataset | PairingWords | Galactus | Saiyan |
|---|---|---|---|
| Headlines (750 pairs) | 0.7642 (3) | 0.7428 (7) | 0.7838 (1) |
| OnWN (561 pairs) | 0.7529 (5) | 0.7053 (12) | 0.5593 (36) |
| FNWN (189 pairs) | 0.5818 (1) | 0.5444 (3) | 0.5815 (2) |
| SMT (750 pairs) | 0.3804 (8) | 0.3705 (11) | 0.3563 (16) |
| Weighted mean | 0.6181 (1) | 0.5927 (2) | 0.5683 (4) |
![Page 22: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/22.jpg)
Translation – Step Two: disambiguation algorithm
• Assemble the best interpretation using statistics of the data
• Use pointwise mutual information (PMI) between RDF terms in the LOD collection
– Measures the degree to which two RDF terms co-occur in the knowledge base
• In a good interpretation, ontology terms associate with one another in the same way that their corresponding user terms connect in the query
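For reference, PMI between two terms can be estimated from co-occurrence counts as log(P(x, y) / (P(x) P(y))). The sketch below uses invented counts and a hypothetical `pmi` helper; how the actual system counts co-occurrence in the knowledge base is not specified here.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information log(P(x, y) / (P(x) * P(y))) from counts."""
    if count_xy == 0:
        return float("-inf")  # the two terms never co-occur
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log(p_xy / (p_x * p_y))


# Hypothetical counts: each term occurs 100 times in 1000 observations.
total = 1000
strong = pmi(count_xy=80, count_x=100, count_y=100, total=total)  # co-occur often
weak = pmi(count_xy=5, count_x=100, count_y=100, total=total)     # co-occur rarely
print(strong, weak)
```

A positive value means the pair co-occurs more often than chance would predict, which is the signal used to prefer one candidate interpretation over another.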
![Page 23: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/23.jpg)
Translation – Step Two: disambiguation algorithm
Three aspects are combined to derive an overall goodness measure for each candidate interpretation
• Joint disambiguation
• Resolving direction
• Link reasonableness
![Page 24: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/24.jpg)
Translation result
Concepts: Place => Place, Author => Writer, Book => Book
Properties: born in => birthPlace, wrote => author (inverse direction)
![Page 25: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/25.jpg)
SPARQL Generation
The translation of a semantic graph query to SPARQL is straightforward given the mappings
Concepts
• Place => Place
• Author => Writer
• Book => Book
Relations
• born in => birthPlace
• wrote => author
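Given mappings like these, producing SPARQL text is mostly string assembly. The following is a minimal sketch, not the system's generator: the `to_sparql` function, the variable names, the choice of `?place` as the output and the use of the `dbo:` prefix for every term are assumptions made for illustration.

```python
PREFIXES = "PREFIX dbo: <http://dbpedia.org/ontology/>"


def to_sparql(output_var, node_classes, links):
    """Build a SELECT query.

    node_classes: {variable name: ontology class}
    links: (subject variable, property, object variable) triples
    """
    patterns = [f"?{var} a dbo:{cls} ." for var, cls in node_classes.items()]
    patterns += [f"?{s} dbo:{p} ?{o} ." for s, p, o in links]
    body = "\n  ".join(patterns)
    return f"{PREFIXES}\nSELECT DISTINCT ?{output_var} WHERE {{\n  {body}\n}}"


node_classes = {"place": "Place", "author": "Writer", "book": "Book"}
links = [
    ("author", "birthPlace", "place"),  # born in => birthPlace
    ("book", "author", "author"),       # wrote => author, inverse direction
]
sparql = to_sparql("place", node_classes, links)
print(sparql)
```

Note how the inverse direction is handled simply by swapping subject and object in the second link.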
![Page 26: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/26.jpg)
Evaluation
• 33 test questions from the 2011 Workshop on Question Answering over Linked Data, answerable using DBpedia
• Three human subjects unfamiliar with DBpedia translated the test questions into semantic graph queries
• Compared with two top natural language QA systems: PowerAqua and True Knowledge
![Page 27: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/27.jpg)
Current work
• Baseline system works well for DBpedia; we’re testing a second use case now
• Current work
– Better entity matching
– Relaxing the need for type information
– A better Web interface with user feedback & advice
• See http://ebiq.org/93 for more information & try our alpha version at http://ebiq.org/GOR
![Page 28: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/28.jpg)
Generating Linked Data by Inferring the Semantics of Tables
Research with Varish Mulwad
http://ebiq.org/j/96
![Page 29: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/29.jpg)
Goal: Table => LOD*
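| Name | Team | Position | Height |
|---|---|---|---|
| Michael Jordan | Chicago | Shooting guard | 1.98 |
| Allen Iverson | Philadelphia | Point guard | 1.83 |
| Yao Ming | Houston | Center | 2.29 |
| Tim Duncan | San Antonio | Power forward | 2.11 |

Annotations shown on the slide:
– Team column: http://dbpedia.org/class/yago/NationalBasketballAssociationTeams
– Allen Iverson: http://dbpedia.org/resource/Allen_Iverson
– Height column: Player height in meters
– Relation between columns: dbprop:team
* DBpedia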
![Page 30: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/30.jpg)
Goal: Table => LOD*
| Name | Team | Position | Height |
|---|---|---|---|
| Michael Jordan | Chicago | Shooting guard | 1.98 |
| Allen Iverson | Philadelphia | Point guard | 1.83 |
| Yao Ming | Houston | Center | 2.29 |
| Tim Duncan | San Antonio | Power forward | 2.11 |

RDF Linked Data
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix yago: <http://dbpedia.org/class/yago/> .
"Name"@en is rdfs:label of dbo:BasketballPlayer .
"Team"@en is rdfs:label of yago:NationalBasketballAssociationTeams .
"Michael Jordan"@en is rdfs:label of dbpedia:Michael_Jordan .
dbpedia:Michael_Jordan a dbo:BasketballPlayer .
"Chicago Bulls"@en is rdfs:label of dbpedia:Chicago_Bulls .
dbpedia:Chicago_Bulls a yago:NationalBasketballAssociationTeams .
All this in a completely automated way
* DBpedia
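Once a cell has been linked to an entity and its column to a class, emitting the RDF is mechanical. The sketch below is an illustration only: `cell_to_turtle` is a hypothetical helper, and it writes the label in the plain Turtle form `entity rdfs:label "..."@en` instead of the `is ... of` form used on the slide.

```python
def cell_to_turtle(label, entity, cls):
    """Return Turtle-style statements for one linked table cell."""
    return [
        f'{entity} rdfs:label "{label}"@en .',
        f"{entity} a {cls} .",
    ]


lines = cell_to_turtle(
    "Michael Jordan", "dbpedia:Michael_Jordan", "dbo:BasketballPlayer"
)
print("\n".join(lines))
```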
![Page 31: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/31.jpg)
Tables are everywhere !! … yet …
The web – 154 million high quality relational tables
Fewer than 1% of the 400K tables at data.gov have rich semantic schemas
![Page 32: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/32.jpg)
A Domain Independent Framework
• Pre-processing modules: Sampling, Acronym detection
• (1) Query and generate initial mappings
• (2) Joint Inference/Assignment
• Generate Linked RDF
• Verify (optional)
• Store in a knowledge base & publish as LOD
![Page 33: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/33.jpg)
Query and Rank
• Table cell strings: Chicago, Boston, Allen Iverson
• Rank(String Similarity, Popularity)
• Possible entities for Chicago: 1. Chicago_Bulls 2. Chicago 3. Judy_Chicago
Can be replaced by Domain Specific / other LOD knowledge bases
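A ranking that combines string similarity with popularity might look like the sketch below. Everything specific in it is an assumption: the equal weights, the popularity values, and the use of `difflib.SequenceMatcher` as the similarity measure are stand-ins chosen for illustration, not the system's actual ranking function.

```python
from difflib import SequenceMatcher


def string_similarity(cell, entity):
    """Similarity in [0, 1] between a cell string and an entity name."""
    name = entity.replace("_", " ").lower()
    return SequenceMatcher(None, cell.lower(), name).ratio()


def rank(cell, popularity, w_sim=0.5, w_pop=0.5):
    """Order candidate entities by a weighted sum of similarity and popularity."""
    scores = {
        entity: w_sim * string_similarity(cell, entity) + w_pop * pop
        for entity, pop in popularity.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical popularity values in [0, 1], for illustration only.
toy_popularity = {"Chicago": 0.6, "Judy_Chicago": 0.2, "Chicago_Bulls": 1.0}
ranking = rank("Chicago", toy_popularity)
print(ranking)
```

With these made-up numbers a very popular entity can outrank an exact string match, which is why the later joint inference step is still needed to settle on the right entity.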
![Page 34: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/34.jpg)
Generating candidate ‘types’ for Columns
• Column (Class): Team
• Cells (Instance): Chicago, Philadelphia, Houston, San Antonio
• Candidate entities for Chicago: 1. Chicago Bulls 2. Chicago 3. Judy Chicago
• Classes of the candidate entities, one set per cell:
– {dbpedia-owl:Place, dbpedia-owl:City, yago:WomenArtist, yago:LivingPeople, yago:NationalBasketballAssociationTeams}
– {dbpedia-owl:Place, dbpedia-owl:PopulatedPlace, dbpedia-owl:Film, yago:NationalBasketballAssociationTeams …. ….. …..}
– {…………………………………………………………….}
• Candidate classes for the column: dbpedia-owl:Place, dbpedia-owl:City, yago:WomenArtist, yago:LivingPeople, yago:NationalBasketballAssociationTeams, dbpedia-owl:PopulatedPlace, dbpedia-owl:Film ….
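The column's candidate classes are the union of the classes found for each cell's candidate entities. A small sketch with the two class sets visible on the slide follows; which cell the second set belongs to is an assumption, and elided members are simply left out.

```python
# Class sets per cell, as far as they are visible on the slide.
cell_types = {
    "Chicago": {
        "dbpedia-owl:Place",
        "dbpedia-owl:City",
        "yago:WomenArtist",
        "yago:LivingPeople",
        "yago:NationalBasketballAssociationTeams",
    },
    # Assumed to be the next cell in the column; the slide does not say.
    "Philadelphia": {
        "dbpedia-owl:Place",
        "dbpedia-owl:PopulatedPlace",
        "dbpedia-owl:Film",
        "yago:NationalBasketballAssociationTeams",
    },
}

# Candidate classes for the whole column: union over all cells.
column_candidates = set().union(*cell_types.values())
print(sorted(column_candidates))
```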
![Page 35: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/35.jpg)
Joint Inference / Assignment
![Page 36: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/36.jpg)
A graphical model for tables
Joint inference over evidence in a table
[Figure: column header variables C1, C2, C3 and row value variables R11 … R33; example column Team (Class) with cells Chicago, Philadelphia, Houston, San Antonio (Instance)]
![Page 37: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/37.jpg)
Parameterized Graphical Model
[Figure: factor graph over column headers C1, C2, C3 and row values R11 … R33, with factor 𝝍𝟓 labelled]
Figure labels:
– Variable Node: Column header
– Row value
– Factor Node
– Function capturing affinity between column headers and row values
– Captures interaction between column headers
– Captures interaction between row values
![Page 38: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/38.jpg)
Standard message passing
Joint Assignment: P(C1, C2, C3, R11, R12, R13, R21, R22, R23, R31, R32, R33)
Graphical Models: Exploit Conditional Independences
– P(C1, R11, R12, R13)
– P(C2, R21, R22, R23)
– P(C3, R31, R32, R33)
– P(R31, R32, R33)
Still … a table over C1, R11, R12, R13 (columns: C1, R11, R12, R13, Val) has 4 variables, each having 25 options: 390,625 entries!
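The table size quoted on the slide is just the number of options raised to the number of variables. The helper below is a hypothetical illustration; the 12-variable figure assumes, for the sake of comparison, that every variable in the full joint also has 25 options.

```python
def factor_table_entries(num_variables, options_per_variable):
    """Number of rows in a full table over the given variables."""
    return options_per_variable ** num_variables


one_factor = factor_table_entries(4, 25)    # C1, R11, R12, R13
full_joint = factor_table_entries(12, 25)   # all 12 variables (assumed 25 options each)
print(f"{one_factor:,} entries per factor, {full_joint:,} for the full joint")
```

Even after factoring, a single factor table is large, which motivates the lighter-weight semantic message passing described next.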
![Page 39: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/39.jpg)
Semantic message passing
[Figure: factor nodes send “Change” / “No Change” messages to the variable nodes]
Current assignments:
– R11:[Michael_I_Jordan], R12:[Yao_Ming], R13:[Allen_Iverson]
– R21:[Chicago_Bulls], R31:[Shooting_Guard], ……
– C1:[BasketballPlayer], C2:[NBATeam], C3:[BasketBallPositions]
Messages shown:
– “Change” (BasketBallPlayer)
– “Change” (Related to Chicago_Bulls)
– all other messages: “No Change”
![Page 40: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/40.jpg)
Inference – Example
• Current assignments: R11:[Michael_I_Jordan], R12:[Allen_Iverson], R13:[Yao_Ming], C1:[Name]
• The factor node receives (Michael_I_Jordan, Yao_Ming, Allen_Iverson) and proposes the class “BasketBallPlayer”
• Messages: “Change” to R11, “No Change” to R12 and R13
• Candidates for R11 (Michael Jordan): 1. Michael_I_Jordan (Professor) 2. ….. 3. Michael_Jordan (BasketballPlayer) ….
![Page 41: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/41.jpg)
Column header – row value agreement
Row values for the Name column: [Michael_I_Jordan, Allen_Iverson, Yao_Ming]
Classes shown for the row entities:
– BasketballPlayer, Athlete, LivingPeople
– LivingPeople, AI_Researchers
Candidate classes: WomenArtist, City, PopulatedPlace, Athlete, Film, LivingPeople, GeoPopulatedPlace, BasketBallPlayer, ArtWork
1: Majority Voting (each row entity adds +1 to each of its classes)
– DBpedia ranking: 1. Athlete 2. City 3. …
– Yago ranking: 1. LivingPeople 2. BasketBallPlayer 3. GeoPopulatedPlace ….
Yago Tie-breaker/Re-order: Choose the more ‘descriptive’ class. E.g. BasketBallPlayer is better than LivingPeople
ClassGranularityScore = 1-[]
2: Choose the top Class
Top Yago: BasketBallPlayer; Top DBpedia: Athlete
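The majority-voting step can be sketched with a counter. The mapping from each entity to its classes is an assumption read off the slide layout, and this sketch pools Yago and DBpedia classes into one tally, whereas the slide ranks them separately. The granularity-based tie-breaker is not reproduced because its formula is not recoverable here.

```python
from collections import Counter

# Assumed class lists for the currently assigned row entities.
row_classes = {
    "Michael_I_Jordan": ["LivingPeople", "AI_Researchers"],
    "Allen_Iverson": ["BasketballPlayer", "Athlete", "LivingPeople"],
    "Yao_Ming": ["BasketballPlayer", "Athlete", "LivingPeople"],
}


def vote(row_classes):
    """Each row entity adds +1 to every class it belongs to."""
    votes = Counter()
    for classes in row_classes.values():
        votes.update(set(classes))
    return votes


votes = vote(row_classes)
print(votes.most_common())
```

Plain voting puts the very general LivingPeople first, which is exactly why the slide adds a re-ordering step that favours more descriptive classes.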
![Page 42: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/42.jpg)
Column header – row value agreement
topClassScore = (numberOfVotes)/(numberOfRows)
Compute topScoreYago & topScoreDBpedia
• If both scores < Threshold: update Column Header Annotation = “No-Annotation”
• If (topScoreYago || topScoreDBpedia) >= Threshold: check for alignment (is Athlete a sub/superClass of BasketBallPlayer?)
– Column Header Annotation = BasketBallPlayer, Athlete
Messages to the row values of the Name column (Michael_I_Jordan, Allen_Iverson, Yao_Ming):
– Classes BasketballPlayer, Athlete, LivingPeople: No-Change
– Classes LivingPeople, AI_Researchers: Change
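The score and threshold test can be written down directly. In this sketch the threshold value of 0.5 and the function names are hypothetical, and the sub/superclass alignment check is left out, so it should be read as an outline of the decision and not as the system's procedure.

```python
def top_class_score(number_of_votes, number_of_rows):
    """topClassScore = numberOfVotes / numberOfRows."""
    return number_of_votes / number_of_rows


def header_annotation(top_yago, top_dbpedia, number_of_rows, threshold):
    """top_yago and top_dbpedia are (class name, votes) pairs."""
    yago_score = top_class_score(top_yago[1], number_of_rows)
    dbpedia_score = top_class_score(top_dbpedia[1], number_of_rows)
    if yago_score < threshold and dbpedia_score < threshold:
        return "No-Annotation"
    # The alignment check between the two classes is omitted in this sketch.
    return [top_yago[0], top_dbpedia[0]]


THRESHOLD = 0.5  # hypothetical value
confident = header_annotation(("BasketBallPlayer", 2), ("Athlete", 2), 3, THRESHOLD)
unsure = header_annotation(("BasketBallPlayer", 1), ("Athlete", 1), 3, THRESHOLD)
print(confident, unsure)
```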
![Page 43: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/43.jpg)
Update row value entity annotations
• R11 (Michael Jordan) receives “CHANGE” from 𝝍𝟑 with Entity Class: BasketBallPlayer or Athlete
• Candidate Entities for R11: 1. Michael_I_Jordan (LivingPeople, AI_Researchers) 2. ….. 3. Michael_Jordan (BasketBallPlayer, Athlete) ….
• Result: R11 Michael Jordan => Michael_Jordan
![Page 44: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/44.jpg)
Evaluation
• Dataset of 80 tables (Wikipedia tables; part of a larger dataset released by IIT-Bombay)
• Evaluated Column Header Annotation Accuracy
– How good was the mapping of Team to NationalBasketballAssociationTeams?
• Evaluated Entity Linking Accuracy
– Mapping Michael Jordan to Michael_Jordan
![Page 45: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/45.jpg)
Column Header Annotation Accuracy
• System produced a ranked list of Yago & DBpedia classes
• Human judges evaluated each class
• For precision, judges scored each class
– 1 if the class was accurate
– 0.5 if the class was ok, but not best (e.g., Place vs. City)
– 0 if it was incorrect
• For recall, score 1 if accurate/correct, 0 for incorrect
[Chart: judgments labelled Accurate, Okay, Incorrect; values 522, 259, 422]
![Page 46: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/46.jpg)
Top K classes, F-measure
[Chart: Column Header Annotations, F-Measure]

| System | F-Measure |
|---|---|
| Yago rank 1 | 0.688360450563204 |
| Yago rank 2 | 0.487568135310842 |
| Yago rank 3 | 0.438263244010532 |
| dbp rank 1 | 0.677154394707449 |
| dbp rank 2 | 0.626750158305777 |
| dbp rank 3 | 0.612644415917844 |
| GOOG | 0.67 |
| IIT-B | 0.56 |
![Page 47: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/47.jpg)
Entity linking accuracy
Example: Allen Iverson => http://dbpedia.org/resource/Allen_Iverson
• Correctly Linked Entities: 3022
• Incorrectly Linked Entities: 959
• Total Entities: 3981
• Accuracy: 75.91%
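The reported accuracy follows directly from the two counts, as this short check shows (variable names are illustrative).

```python
correct, incorrect = 3022, 959

total = correct + incorrect   # all linked entities
accuracy = correct / total    # fraction linked correctly
print(f"{total} entities, accuracy {accuracy:.2%}")
```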
![Page 48: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/48.jpg)
Other Challenges
• Using table captions and other text in associated documents to provide context
• Size of some data.gov tables (> 400K rows!) makes using the full graphical model impractical
– Sample the table and run the model on the subset
• Achieving acceptable accuracy may require human input
– 100% accuracy unattainable automatically
– How best to let humans offer advice and/or correct interpretations?
![Page 49: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/49.jpg)
Final Conclusions
• Linked data is great for sharing structured and semi-structured data
– Backed by machine-understandable semantics
– Uses successful Web languages and protocols
• Generating and exploring linked data resources is challenging
– Schemas are too large, too many URIs
• New tools for mapping tables to linked data and translating structured natural language queries reduce the barriers
![Page 50: The Semantic Web: there and back again Tim Finin University of Maryland, Baltimore County Joint work with Lushan Han, Varish Mulwad, Anupam Joshi .](https://reader036.fdocuments.us/reader036/viewer/2022070411/56649f2f5503460f94c48e60/html5/thumbnails/50.jpg)
http://ebiq.org/