A Graph-Based Approach to Learn Semantic Descriptions of Data Sources
![Page 1: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/1.jpg)
A Graph-based Approach to Learn Semantic Descriptions of Data Sources
Mohsen Taheriyan
Craig Knoblock
Pedro Szekely
Jose Luis Ambite
![Page 2: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/2.jpg)
Problem: How to learn semantic descriptions?
![Page 3: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/3.jpg)
First, what is a semantic description?
![Page 4: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/4.jpg)
Semantic Description
Describing the source in terms of the concepts and relationships defined by the domain ontology

[Figure: Domain Ontology. Classes: Person, Organization, Place, City, State, Event. Properties: name, birthdate, phone, postalCode, startDate, endDate, title, bornIn, livesIn, worksFor, ceo, location, organizer, nearby, state, isPartOf. Legend: object property, data property, subClassOf.]

Source

| Column 1 | Column 2 | Column 3 | Column 4 | Column 5 |
|---|---|---|---|---|
| Bill Gates | Oct 1955 | Microsoft | Seattle | WA |
| Mark Zuckerberg | May 1984 | Facebook | White Plains | NY |
| Larry Page | Mar 1973 | Google | East Lansing | MI |
![Page 5: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/5.jpg)
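Semantic Types

| Column 1 | Column 2 | Column 3 | Column 4 | Column 5 |
|---|---|---|---|---|
| Bill Gates | Oct 1955 | Microsoft | Seattle | WA |
| Mark Zuckerberg | May 1984 | Facebook | White Plains | NY |
| Larry Page | Mar 1973 | Google | East Lansing | MI |

Semantic types: Column 1 → Person.name, Column 2 → Person.birthdate, Column 3 → Organization.name, Column 4 → City.name, Column 5 → State.name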
![Page 6: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/6.jpg)
Relationships
[Figure: the same source table, with its columns attached to the classes Person (name, birthdate), Organization (name), City (name), and State (name); the classes are connected by the links bornIn, worksFor, and state.]
This semantic model is converted to a semantic description in R2RML
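As a rough illustration of what such a semantic model contains, the sketch below stores the semantic types and relationships of the example table in a small Python structure. The class name `SemanticModel`, its fields, and the link directions (for example Person bornIn City) are assumptions made for this sketch; they are not Karma's data model, and no R2RML is generated here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SemanticModel:
    """Hypothetical container for the semantic model of one source."""
    # column name -> (ontology class, data property)
    semantic_types: Dict[str, Tuple[str, str]]
    # (source class, object property, target class); directions are assumed
    relationships: List[Tuple[str, str, str]] = field(default_factory=list)

    def classes(self) -> Set[str]:
        """Return every ontology class mentioned by the model."""
        found = {cls for cls, _ in self.semantic_types.values()}
        for source, _, target in self.relationships:
            found.update((source, target))
        return found


model = SemanticModel(
    semantic_types={
        "Column 1": ("Person", "name"),
        "Column 2": ("Person", "birthdate"),
        "Column 3": ("Organization", "name"),
        "Column 4": ("City", "name"),
        "Column 5": ("State", "name"),
    },
    relationships=[
        ("Person", "bornIn", "City"),
        ("Person", "worksFor", "Organization"),
        ("City", "state", "State"),
    ],
)

print(sorted(model.classes()))
```

Keeping semantic types and relationships in separate fields mirrors the two steps on the preceding slides: types are assigned per column, relationships connect the resulting classes.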
![Page 7: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/7.jpg)
Previous approach to learn semantic descriptions
![Page 8: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/8.jpg)
Karma
Domain Ontology + Sample Data → Learn Semantic Types (CRF) → Extract Relationships (Steiner Tree) → Semantic Model
http://www.isi.edu/integration/karma @KarmaSemWeb
![Page 9: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/9.jpg)
Refining The Model
Initial Model
![Page 10: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/10.jpg)
Refining The Model
Refined Model
• Previous work does not learn from the changes the user makes to relationships
• The user has to go through the refinement process each time a source is modeled
![Page 11: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/11.jpg)
Our new approach to learn semantic descriptions
![Page 12: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/12.jpg)
Key Idea
• Sources in the same domain often have similar data
• Exploit knowledge of existing source models
• Leverage relationships in known source models to hypothesize relationships for new sources
![Page 13: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/13.jpg)
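Approach
Inputs: Domain Ontology, Known Source Models (S1, S2, …, Sn), New Source
• Known Source Models + Domain Ontology → Construct Graph G
• New Source → Learn Semantic Types (CRF)
• Graph G + learned semantic types → Generate Candidate Models → Rank Results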
![Page 14: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/14.jpg)
Example
Known Source Models
• S1 = personalInfo(name, birthdate, city, state, workplace)
– Classes: Person (name, birthdate), Organization (name), City (name), State (name)
– Links: bornIn, worksFor, state
• S2 = getCities(state, city)
– Classes: City (name), State (name)
– Links: state
• S3 = businessInfo(company, ceo, city, state)
– Classes: Organization (name), Person (name), City (name), State (name)
– Links: ceo, location, isPartOf
Domain Ontology
New Source
• S4 = postalCodeLookup(zipcode, city, state)
![Page 15: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/15.jpg)
Build a Graph from Known Models
• Create a component in G for each known source model
– Only add if the model is not a subgraph of an existing component
• Annotate links with the list of supporting models
[Figure: Component 1, built from S1 = personalInfo. Class nodes Person, Organization, City, State; data nodes Person.name, Person.birthdate, Org.name, City.name, State.name; links name, birthdate, bornIn, worksFor, state. Every link is annotated {s1}.]
![Page 16: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/16.jpg)
Build a Graph from Known Models
[Figure: Component 1 after processing S2 = getCities. No new component is added; three links of Component 1 are now annotated {s1,s2}, the remaining links stay {s1}.]
![Page 17: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/17.jpg)
Build a Graph from Known Models
[Figure: Component 1 (from S1 = personalInfo and S2 = getCities, links annotated {s1} or {s1,s2}) and Component 2 (from S3 = businessInfo: Organization, Person, City, State with links name, ceo, location, isPartOf, all annotated {s3}).]
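The component-building rule above can be sketched as follows. This is a simplified illustration, not the authors' implementation: each model is reduced to a set of (source, label, target) links, the subgraph test is a plain subset check on those links, and the link directions for `ceo` and `location` are assumptions. The function name `build_components` is hypothetical.

```python
def build_components(models):
    """Build graph components from known source models.

    `models` maps a model name to a set of (source, label, target) links.
    A model that is contained in an existing component only adds its name
    to the supporting-model annotation of the matching links.
    """
    components = []  # each component: dict mapping link -> set of model names
    for name, links in models.items():
        for component in components:
            if links.issubset(component):  # model is a subgraph of this component
                for link in links:
                    component[link].add(name)
                break
        else:
            components.append({link: {name} for link in links})
    return components


# Links of the three known models from the example (directions partly assumed).
MODELS = {
    "s1": {
        ("Person", "name", "Person.name"),
        ("Person", "birthdate", "Person.birthdate"),
        ("Person", "bornIn", "City"),
        ("Person", "worksFor", "Organization"),
        ("Organization", "name", "Org.name"),
        ("City", "name", "City.name"),
        ("City", "state", "State"),
        ("State", "name", "State.name"),
    },
    "s2": {
        ("City", "name", "City.name"),
        ("City", "state", "State"),
        ("State", "name", "State.name"),
    },
    "s3": {
        ("Organization", "name", "Org.name"),
        ("Organization", "ceo", "Person"),
        ("Person", "name", "Person.name"),
        ("Organization", "location", "City"),
        ("City", "name", "City.name"),
        ("City", "isPartOf", "State"),
        ("State", "name", "State.name"),
    },
}

components = build_components(MODELS)
for index, component in enumerate(components, start=1):
    print(f"Component {index}: {len(component)} links")
```

Note that this sketch depends on processing order: a small model seen before a larger one that contains it would get its own component.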
![Page 18: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/18.jpg)
Build a Graph from Known Models
• Connect graph components using all paths inferred from the ontology
[Figure: Components 1 and 2 connected through ontology-inferred links. New class nodes Event and Place appear; the added links are labeled location, organizer, ceo, worksFor, and isPartOf.]
![Page 19: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/19.jpg)
Build a Graph from Known Models
• Assign low weight = ε to links within a component (black links)
• Weight other links (green links) according to the formula below

M = known source models
$W_{max}$ = number of links in M ($\geq |E_G|$) = 18
$c_1(e)$ = number of links in M whose <label, source, target> match e
$c_2(e)$ = number of links in M whose <label> matches e

$$w_e = \min\left(W_{max} - c_1(e),\; W_{max} - \frac{c_2(e)}{W_{max}}\right)$$

[Figure: graph G with weighted links; the ontology-inferred links carry the weights 18, 17, and 17.94.]
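A minimal sketch of the weight formula, assuming the 18 links of the three known models are listed as (source, label, target) tuples. The link directions for `ceo` and `location` and the function name `link_weight` are assumptions for illustration. It shows how the three weights on the slide (18, 17, and 17.94) can arise.

```python
# Links of the known models S1, S2, S3 (8 + 3 + 7 = 18 links in total).
S1 = [
    ("Person", "name", "Person.name"), ("Person", "birthdate", "Person.birthdate"),
    ("Person", "bornIn", "City"), ("Person", "worksFor", "Organization"),
    ("Organization", "name", "Org.name"), ("City", "name", "City.name"),
    ("City", "state", "State"), ("State", "name", "State.name"),
]
S2 = [
    ("City", "name", "City.name"), ("City", "state", "State"),
    ("State", "name", "State.name"),
]
S3 = [
    ("Organization", "name", "Org.name"), ("Organization", "ceo", "Person"),
    ("Person", "name", "Person.name"), ("Organization", "location", "City"),
    ("City", "name", "City.name"), ("City", "isPartOf", "State"),
    ("State", "name", "State.name"),
]
KNOWN_LINKS = S1 + S2 + S3


def link_weight(link, known_links):
    """Weight of an ontology-inferred link e = (source, label, target)."""
    w_max = len(known_links)
    # c1: known links matching label, source, and target
    c1 = sum(1 for known in known_links if known == link)
    # c2: known links matching the label only
    c2 = sum(1 for known in known_links if known[1] == link[1])
    return min(w_max - c1, w_max - c2 / w_max)


for candidate in [
    ("Event", "organizer", "Person"),        # never seen in a known model
    ("Person", "worksFor", "Organization"),  # exact match in S1
    ("Place", "isPartOf", "City"),           # label seen once, in S3
]:
    print(candidate, round(link_weight(candidate, KNOWN_LINKS), 2))
```

An exact match lowers the weight by a full unit, while a label-only match lowers it by just 1/W_max, so links already seen in known models are preferred over links that merely reuse a familiar label.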
![Page 20: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/20.jpg)
Learn Semantic Types (Previous Work)
• A CRF-based model to assign a Semantic Type to each column from its data
• Semantic Type
– Ontology Class
– Data Property + Domain
S4 = postalCodeLookup(zipcode, city, state)
Learned semantic types: zipcode → Place.postalCode, city → City.name, state → State.name
![Page 21: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/21.jpg)
Generate Candidate Models
• Map learned semantic types to nodes in graph G
– There might be multiple mappings
• Compute Steiner tree (minimal tree) for each mapping
S4 = postalCodeLookup(zipcode, city, state), with semantic types Place.postalCode, City.name, State.name
[Figure: weighted graph G built from the known models and the ontology (link weights 18, 17, and 17.94 on the ontology-inferred links).]
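The mapping step can be sketched as a Cartesian product over the graph nodes that match each learned semantic type. The node representation (a component tag plus a label) and the function name `enumerate_mappings` are assumptions for this sketch; the component tags "c1", "c2", and "new" are hypothetical identifiers.

```python
from itertools import product


def enumerate_mappings(semantic_types, graph_nodes):
    """Return every way to map each semantic type to a matching node in G.

    `graph_nodes` is a list of (component tag, node label) pairs.
    """
    matches = [
        [node for node in graph_nodes if node[1] == semantic_type]
        for semantic_type in semantic_types
    ]
    return [dict(zip(semantic_types, combo)) for combo in product(*matches)]


# Data nodes of G relevant to S4; City.name and State.name exist in both
# components, Place.postalCode is assumed to be a single newly added node.
GRAPH_NODES = [
    ("c1", "City.name"), ("c1", "State.name"),
    ("c2", "City.name"), ("c2", "State.name"),
    ("new", "Place.postalCode"),
]
S4_TYPES = ["Place.postalCode", "City.name", "State.name"]

mappings = enumerate_mappings(S4_TYPES, GRAPH_NODES)
for number, mapping in enumerate(mappings, start=1):
    print(f"Mapping {number}:", mapping)
```

With two matching nodes for City.name and two for State.name, the product yields four mappings, the same count as the four mappings walked through on the following slides.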
![Page 22: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/22.jpg)
Generate Candidate Models
[Figure: Mapping 1 of the S4 semantic types (Place.postalCode, City.name, State.name) onto graph G; a Place.postalCode node is attached to Place by a postalCode link.]
![Page 23: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/23.jpg)
Generate Candidate Models
[Figure: graph G, Mapping 1 (continued).]
![Page 24: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/24.jpg)
Generate Candidate Models
[Figure: Mapping 2 of the S4 semantic types (Place.postalCode, City.name, State.name) onto graph G.]
![Page 25: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/25.jpg)
Generate Candidate Models
[Figure: graph G, Mapping 2 (continued).]
![Page 26: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/26.jpg)
Generate Candidate Models
[Figure: Mapping 3 of the S4 semantic types (Place.postalCode, City.name, State.name) onto graph G.]
![Page 27: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/27.jpg)
Generate Candidate Models
[Figure: graph G, Mapping 3 (continued).]
![Page 28: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/28.jpg)
Generate Candidate Models
[Figure: Mapping 4 of the S4 semantic types (Place.postalCode, City.name, State.name) onto graph G.]
![Page 29: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/29.jpg)
Generate Candidate Models
[Figure: graph G, Mapping 4 (continued).]
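Computing a tree that connects the mapped nodes can be sketched with a standard shortest-path heuristic: start from one terminal and repeatedly attach the nearest remaining terminal. This is an approximation chosen for brevity, not necessarily the Steiner tree algorithm used by the authors. The graph is treated as undirected, and ε = 0.01 and the small example graph are assumptions.

```python
import heapq


def nearest_terminal_path(graph, tree_nodes, targets):
    """Multi-source Dijkstra from the current tree to the closest target."""
    dist = {node: 0.0 for node in tree_nodes}
    prev = {}
    heap = [(0.0, node) for node in tree_nodes]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u in targets:
            path = [u]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v, weight in graph[u].items():
            if d + weight < dist.get(v, float("inf")):
                dist[v] = d + weight
                prev[v] = u
                heapq.heappush(heap, (d + weight, v))
    raise ValueError("terminals are not connected")


def approximate_steiner_tree(graph, terminals):
    """Grow a tree from the first terminal until all terminals are attached."""
    terminals = list(terminals)
    tree_nodes, remaining = {terminals[0]}, set(terminals[1:])
    edges, total = set(), 0.0
    while remaining:
        cost, path = nearest_terminal_path(graph, tree_nodes, remaining)
        edges.update(frozenset(pair) for pair in zip(path, path[1:]))
        total += cost
        tree_nodes.update(path)
        remaining -= set(path)
    return edges, total


def add_edge(graph, u, v, weight):
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})[u] = weight


EPS = 0.01  # assumed value for the low in-component weight
G = {}
add_edge(G, "Place", "Place.postalCode", EPS)
add_edge(G, "City", "City.name", EPS)
add_edge(G, "State", "State.name", EPS)
add_edge(G, "City", "State", EPS)      # 'state' link inside a component
add_edge(G, "Place", "City", 17.94)    # ontology-inferred isPartOf
add_edge(G, "Place", "State", 17.94)   # ontology-inferred isPartOf

tree_edges, tree_cost = approximate_steiner_tree(
    G, ["Place.postalCode", "City.name", "State.name"]
)
print(sorted(sorted(edge) for edge in tree_edges), round(tree_cost, 2))
```

Because the in-component links are nearly free, the tree uses only one expensive ontology-inferred link and reaches State through the existing City to State link.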
![Page 30: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/30.jpg)
Rank Source Models
• Rank the candidates based on:
– Cost: sum of the weights
– Coherence: prefer the models with higher number of supporting models
[Figure: the four candidate models for S4, each over Place, City, and State with the properties postalCode and name, connected by isPartOf and/or state links; links are annotated with their supporting models ({s1,s2} and/or {s3}).]
Rank 1: Candidate 1
Rank 2: Candidate 4
Rank 3: Candidate 2
Rank 3: Candidate 3
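One possible reading of the two ranking criteria is sketched below. The slides do not define coherence precisely, so this sketch assumes it is the total number of (link, supporting model) pairs in a candidate and uses it to break ties between candidates of equal cost. The weights and supporter sets are illustrative values only, and `rank_candidates` is a hypothetical name.

```python
def rank_candidates(candidates):
    """Sort candidate names by lower cost first, then by higher support.

    `candidates` maps a name to a list of (weight, supporting models) links.
    """
    def key(name):
        links = candidates[name]
        cost = round(sum(weight for weight, _ in links), 6)
        support = sum(len(models) for _, models in links)  # assumed coherence
        return (cost, -support)

    return sorted(candidates, key=key)


EPS = 0.01  # assumed in-component weight
CANDIDATES = {
    # three in-component links plus one ontology-inferred link
    "Candidate 1": [(EPS, {"s1", "s2"}), (EPS, {"s1", "s2"}),
                    (EPS, {"s1", "s2"}), (17.94, set())],
    # links from two different components, two ontology-inferred links
    "Candidate 2": [(EPS, {"s1", "s2"}), (EPS, {"s3"}),
                    (17.94, set()), (17.94, set())],
    "Candidate 3": [(EPS, {"s3"}), (EPS, {"s1", "s2"}),
                    (17.94, set()), (17.94, set())],
    "Candidate 4": [(EPS, {"s3"}), (EPS, {"s3"}),
                    (EPS, {"s3"}), (17.94, set())],
}

ranking = rank_candidates(CANDIDATES)
print(ranking)
```

With these illustrative inputs, Candidates 1 and 4 have the same cost and Candidate 1 wins on support, while Candidates 2 and 3 tie on both criteria, which is consistent with the shared Rank 3 on the slide.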
![Page 31: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/31.jpg)
Evaluation
• Dataset 1
– 17 data sources containing overlapping data
– Semantic descriptions created manually using DBPedia, FOAF, GeoNames, and WGS84 ontologies
• Dataset 2
– 6 museum sources
– Semantic descriptions created by domain experts using EDM, SKOS, and FOAF ontologies
• Learned a source model assuming other models as input
• Computed the Graph Edit Distance (GED) between the learned model and the correct one
– Operations: node insertion, node deletion, edge insertion, edge deletion, edge relabeling
• Compared the results with our previous work in Karma
![Page 32: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/32.jpg)
Results - Dataset 1

| Source Signature | #Attributes | GED: Previous work | GED: New Approach (Rank 1) |
|---|---|---|---|
| nearestCity(lat, lng, city, state, country) | 5 | 6 | 1 |
| findRestaurant(zipcode, restaurantName, phone, address) | 4 | 1 | 0 |
| zipcodesInCity(city, state, postalCode) | 3 | 3 | 1 |
| parseAddress(address, city, state, zipcode, country) | 5 | 6 | 1 |
| citiesOfState(state, city) | 2 | 1 | 0 |
| ocean(lat, lng, name) | 3 | 2 | 1 |
| postalCodeLookup(zipCode, city, state, country) | 4 | 6 | 1 |
| country(lat, lng, code, name) | 4 | 2 | 0 |
| companyCEO(company, name) | 2 | 1 | 0 |
| personalInfo(firstname, lastname, birthdate, brithCity, birthCountry) | 5 | 4 | 1 |
| businessInfo(company, phone, homepage, city, country, name) | 6 | 10 | 8 |
| restaurantChef(restaurant, firstname, lastname) | 3 | 2 | 1 |
| findSchool(city, state, name, code, homepage, ranking, dean) | 7 | 8 | 6 |
| employees(organization, firstname, lastname, birthdate) | 4 | 1 | 2 |
| education(person, hometown, homecountry, school, city, country) | 6 | 9 | 4 |
| administrativeDistrict(city, province, country) | 3 | 4 | 1 |
| capital(country, city) | 2 | 2 | 1 |
| TOTAL | 68 | 68 | 29 |

57% improvement
![Page 33: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/33.jpg)
Results - Dataset 2

| Source Signature | #Attributes | GED: Previous work | GED: New Approach (Rank 1) |
|---|---|---|---|
| S1(Attribution, BeginDate, EndDate, Title, Dated, Medium, Dimensions) | 7 | 1 | 0 |
| S2(ObjectID, ObjectTitle, ObjectWorkType, ArtistName, ArtistBirthDate, ArtistDeathDate, ObjectEarliestDate, ObjectRights, ObjectFacetValue1) | 8 | 2 | 3 |
| S3(death, birth, name) | 3 | 0 | 0 |
| S4(accessionNumber, artist, creditLine, dimensions, imageURL, materials, relatedArtworksURL, creationDate, provenance, keywordValues) | 10 | 9 | 6 |
| S5(AccessionNumber, Classification, CreditLine, Date, Description, DimensionsOrphan, WhatValues, Who, image, relatedArtworksValues) | 10 | 9 | 5 |
| S6(Artist, ArtistBornDate, ArtistDiedDate, Classification, Copyright, CreditLine, Image, KeywordValues, Ref, SitterValues) | 10 | 8 | 6 |
| TOTAL | 68 | 29 | 20 |

31% improvement
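The reported improvement figures are consistent with reading them as the relative reduction in total GED from the previous work to the new approach. This interpretation is an assumption, since the slides do not state the formula:

$$\text{Dataset 1: } \frac{68 - 29}{68} \approx 0.574 \approx 57\%, \qquad \text{Dataset 2: } \frac{29 - 20}{29} \approx 0.310 \approx 31\%$$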
![Page 34: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/34.jpg)
Related Work
• Writing semantic descriptions by hand
– R2RML, SWRL
– Tedious and time-consuming task
– Requires expertise in SW technologies
• Semantic annotation of Web services and Web tables
– Very limited in learning the relationships
• Learning Semantic Definitions of Online Information Sources [Carman, Knoblock, 2007]
– Learns LAV rules from known sources
– Can only learn descriptions that overlap known sources
![Page 35: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/35.jpg)
Discussion
• Automatically build rich semantic descriptions of data sources
• Exploit the background knowledge from (i) the domain ontology and (ii) the known source models
• Semantic descriptions are the key ingredients to automate many tasks, e.g.,
– Source Discovery
– Data Integration
– Service Composition
![Page 36: A Graph-Based Approach to Learn Semantic Descriptions of Data Sources](https://reader035.fdocuments.us/reader035/viewer/2022070312/5539c40b4a79594f6c8b4a09/html5/thumbnails/36.jpg)
Future Work
• Investigate how to create a more compact graph
– Consolidate the overlapping segments of the known semantic models
• Relax the problem by removing the constraint that the correct semantic type of each attribute is known
– CRF part returns a set of candidate semantic types along with their confidence values
• Use the data available in the Linked Open Data (LOD) cloud to learn more accurate models
• Put the user in the loop
– Integrate the new approach into Karma
– The user refines one of the suggested models
– The new model will be added to the graph as a new pattern