Towards Transfer Learning of Link Specifications
Axel-Cyrille Ngonga Ngomo, Jens Lehmann, Mofeed Hassan
2013-09-16
Ngonga et. al (Univ. Leipzig) Transfer Learning of Link Specs 2013-09-16 1 / 29
Outline
1 Motivation
2 Transfer Learning Framework
3 Experimental Setup
4 Results
5 Conclusions and Future Work
Why Link Discovery?
1 Fourth Linked Data principle
2 Links are central for
    Cross-ontology QA
    Data Integration
    Reasoning
    Federated Queries
    ...
3 2011 topology of the LOD Cloud:
    31+ billion triples
    ≈ 0.5 billion links
    owl:sameAs in most cases
Why is it difficult?
Definition (Link Discovery)
    Given sets S and T of resources and relation R
    Task: Find M = {(s, t) ∈ S × T : R(s, t)}
    Common approaches:
        Find M′ = {(s, t) ∈ S × T : σ(s, t) ≥ θ}
        Find M′ = {(s, t) ∈ S × T : δ(s, t) ≤ θ}
1 Time complexity
    Large number of triples
    Quadratic a-priori runtime
    69 days for mapping cities from DBpedia to Geonames (1 ms per comparison)
    Decades for linking DBpedia and LGD
    ...
Why is it difficult?
2 Complexity of specifications
    Combination of several attributes required for high precision
    Tedious discovery of most adequate mapping
    Dataset-dependent similarity functions
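The threshold-based approach and the runtime figure quoted under "Time complexity" can be illustrated with a minimal sketch. It assumes a brute-force matcher that compares every pair; the function names, the toy similarity `exact_match`, and the example inputs are hypothetical and not part of LIMES.

```python
from itertools import product


def naive_link_discovery(source, target, sim, theta):
    """Return M' = {(s, t) : sim(s, t) >= theta} by comparing every pair."""
    return {(s, t) for s, t in product(source, target) if sim(s, t) >= theta}


def brute_force_days(num_source, num_target, ms_per_comparison=1.0):
    """Estimated runtime in days when every source is compared with every target."""
    comparisons = num_source * num_target
    return comparisons * ms_per_comparison / 1000 / 86_400


def exact_match(s, t):
    """Toy similarity (an assumption): 1.0 for case-insensitive equality, else 0.0."""
    return 1.0 if s.lower() == t.lower() else 0.0


links = naive_link_discovery(["Leipzig", "Paris"], ["paris", "Rome"], exact_match, 1.0)

# 69 days at 1 ms per comparison corresponds to roughly 5.96 billion comparisons.
comparisons_in_69_days = 69 * 86_400 * 1000
```

Because the number of comparisons is the product of both set sizes, doubling each dataset quadruples the runtime, which is why the naive approach does not scale.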
LIMES Framework
Link Specification
Detection of an accurate link specification is key
A link specification has three components:
    Two sets of restrictions $R^S_1, \ldots, R^S_m$ resp. $R^T_1, \ldots, R^T_k$ that specify the sets $S$ resp. $T$,
    A specification of a complex similarity metric $\sigma$ via the combination of several atomic similarity measures $\sigma_1, \ldots, \sigma_n$, and
    A set of thresholds $\tau_1, \ldots, \tau_n$ such that $\tau_i$ is the threshold for $\sigma_i$.
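The three components can be pictured as a small data structure. This is only a sketch: the class `LinkSpec`, its field names, and the choice to combine the atomic measures conjunctively (every measure must reach its threshold) are assumptions made for illustration; real specifications may combine measures differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Resource = Dict[str, str]  # property name -> value (simplified, hypothetical)


@dataclass
class LinkSpec:
    source_restrictions: List[str]   # restrictions that specify S
    target_restrictions: List[str]   # restrictions that specify T
    measures: List[Callable[[Resource, Resource], float]]  # atomic measures
    thresholds: List[float]          # thresholds[i] belongs to measures[i]

    def accepts(self, s: Resource, t: Resource) -> bool:
        """Assumed combination: every atomic measure must reach its threshold."""
        return all(m(s, t) >= tau for m, tau in zip(self.measures, self.thresholds))


def label_equal(s: Resource, t: Resource) -> float:
    """Toy atomic measure: 1.0 if the labels match ignoring case, else 0.0."""
    return 1.0 if s["label"].lower() == t["label"].lower() else 0.0


spec = LinkSpec(
    source_restrictions=["rdf:type dbpedia-owl:Drug"],
    target_restrictions=["rdf:type drug:drugs"],
    measures=[label_equal],
    thresholds=[1.0],
)
```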
Transfer Learning
[Figure: classical learning of link specs (different linking tasks, each with a learning system) versus transfer learning of link specs (current linking task, transfer learning system, task repository). Figure labels: spec accuracy α, class similarity ζ, property similarity π.]
In our approach, we use transductive transfer learning
Class and property matching is assumed to be known already (numerous approaches from ontology matching can be employed); the goal is to find the complex similarity metric
Transfer Learning Framework I
Transfer learning of link specifications is reduced to three subproblems:
    Restriction/class similarity: $\zeta : 2^{C} \times 2^{C} \to [0, 1]$, e.g. $\zeta(\{\text{City}, \text{Village}\}, \{\text{Town}\}) = 0.6$
    Property similarity: $\xi : 2^{P} \times 2^{P} \to [0, 1]$, e.g. $\xi(\{\text{rdfs:label}\}, \{\text{rdfs:label}\}) = 1.0$
    Accuracy of link specifications: $\alpha : Q \to [0, 1]$
Transfer Learning Framework II
Overall similarity measure for transfer learning (details in paper):
    $\omega(t, t') = \alpha(q') \cdot \zeta(\psi(q'), C) \cdot \zeta(\psi'(q'), C') \cdot \xi(sp(q'), P_L) \cdot \xi(tp(q'), P'_L)$
Each similarity measure can be implemented in several ways
Implementations of the class similarity function $\zeta$ in the framework:
    label-based similarity
    name-based similarity (URI similarity)
    data-centric similarity
Property similarities $\xi$ are defined analogously
Similarities between single classes/properties can be extended to sets (e.g. using the arithmetic / geometric mean of the max. similarity)
A spec can be transferred by replacing its properties with the most similar properties in $P_L$ and $P'_L$
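The extension from single classes or properties to sets can be sketched as follows. The sketch assumes a one-directional reading: for each element of the first set, take its best match in the second set, then average those best scores. The function name `set_similarity`, the `mean` parameter, and the toy similarity `same_name` are hypothetical; the paper may define the aggregation differently (for example, symmetrically).

```python
from math import prod


def set_similarity(xs, ys, sim, mean="arithmetic"):
    """Mean over xs of the maximum similarity to any element of ys."""
    if not xs or not ys:
        return 0.0
    best = [max(sim(x, y) for y in ys) for x in xs]
    if mean == "arithmetic":
        return sum(best) / len(best)
    if mean == "geometric":
        return prod(best) ** (1 / len(best))
    raise ValueError(f"unknown mean: {mean}")


def same_name(a, b):
    """Toy element-level similarity (an assumption): exact string equality."""
    return 1.0 if a == b else 0.0


arithmetic = set_similarity(["rdfs:label", "foaf:name"], ["rdfs:label"], same_name)
geometric = set_similarity(
    ["rdfs:label", "foaf:name"], ["rdfs:label"], same_name, mean="geometric"
)
```

The geometric mean drops to zero as soon as one element has no counterpart at all, whereas the arithmetic mean degrades gradually, so the choice affects how strictly unmatched properties are penalized.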
Example (New Link Task)
Example link specification for mapping drugs in two datasets, DBpedia and Drugbank (DBpedia-Drugbank.xml):
Example (Restriction part)
Three parts of link specs:
Restrictions part
Example (Properties Part)
Three parts of link specs:
Restrictions part
Properties part
Example (Similarity Measures Part)
Three parts of link specs:
Restrictions part
Properties part
Similarity Measures part: similarity metric and thresholds
Example (Link Repository)
Transfer learning is applied using a repository → restrictions and relevant properties are assumed to be known → find the similarity measure by comparing with all specs in the repository, e.g. DBpedia-SiderDrugs.xml
Example (Restriction Similarities)
Restrictions in both specification files

Type    DBpedia-Drugbank.xml         DBpedia-SiderDrugs.xml
Source  rdf:type dbpedia-owl:Drug    rdf:type dbpedia-owl:Drug
Target  rdf:type drug:drugs          rdf:type sider:drugs

Straightforward label/URI similarity
For instance, trigram metric in URI similarity without prefixes:
    ζ({dbpedia-owl:Drug}, {dbpedia-owl:Drug}) = 1.0
    ζ({sider:drugs}, {drug:drugs}) = 1.0
Data-centric: $\zeta_d(s, s') = \frac{1}{|P(s)|\,|P(s')|} \sum_{x \in P(s)} \sum_{y \in P(s')} \mathrm{sim}(x, y)$, where $P(s) = \{x : s\ p\ x \wedge p\ \text{rdf:type}\ \text{owl:DatatypeProperty}\}$ (extends similarity to instances)
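Both restriction similarities can be sketched in a few lines. The trigram measure below is one common variant (Dice coefficient over padded, lowercased character trigrams); it is an assumption and need not match the exact trigram metric used in the framework. The helper names `trigrams`, `local_name` and `data_centric_similarity` are hypothetical.

```python
def trigrams(text):
    """Set of character trigrams of the lowercased, space-padded string."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Dice coefficient over trigram sets; 1.0 for identical strings."""
    ta, tb = trigrams(a), trigrams(b)
    return 2 * len(ta & tb) / (len(ta) + len(tb))


def local_name(curie):
    """Drop the prefix of a prefixed name, e.g. 'sider:drugs' -> 'drugs'."""
    return curie.rsplit(":", 1)[-1]


def data_centric_similarity(values_s, values_t, sim=trigram_similarity):
    """Average pairwise similarity of the datatype property values P(s), P(s')."""
    if not values_s or not values_t:
        return 0.0
    total = sum(sim(x, y) for x in values_s for y in values_t)
    return total / (len(values_s) * len(values_t))


# URI similarity without prefixes, as in the example above.
zeta_target = trigram_similarity(local_name("sider:drugs"), local_name("drug:drugs"))
```

Stripping the prefix is what makes `sider:drugs` and `drug:drugs` identical here; with the full URIs the namespaces would lower the score.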
Example (Property Similarities)
Type    DBpedia-Drugbank.xml            DBpedia-SiderDrugs.xml
Source  rdfs:label                      rdfs:label, foaf:name
Target  rdfs:label, drug:genericName    rdfs:label

Applying the similarity function to all properties:
For instance, trigram similarity based on URIs with the arithmetic mean as aggregation:
    ξ({rdfs:label}, {rdfs:label, foaf:name}) = 0.9
    ξ({rdfs:label, drug:genericName}, {rdfs:label}) = 0.8
Example (Overall Similarity)
Based on, e.g., the F-score, assign a quality value to q′ = DBpedia-SiderDrugs.xml; in our case α(q′) = 0.89
The final step is calculating the overall similarity measure:
    ω(DBpedia-Drugbank.xml, DBpedia-SiderDrugs.xml) = 0.89 · 1.0 · 1.0 · 0.9 · 0.8 ≈ 0.64
The steps are repeated for all link specifications in the repository
The most similar link spec can be transferred by replacing its properties with the most similar ones in the computed property matching
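The worked example and the repetition over the repository can be sketched as below. The factors for DBpedia-SiderDrugs.xml are the ones from the slides; the second repository entry, `hypothetical-other.xml`, and its numbers are invented purely to show the ranking step.

```python
def overall_similarity(alpha, zeta_source, zeta_target, xi_source, xi_target):
    """Product of spec accuracy, class similarities and property similarities."""
    return alpha * zeta_source * zeta_target * xi_source * xi_target


# Repository entry -> (alpha, zeta source, zeta target, xi source, xi target)
repository = {
    "DBpedia-SiderDrugs.xml": (0.89, 1.0, 1.0, 0.9, 0.8),   # values from the example
    "hypothetical-other.xml": (0.95, 0.2, 0.3, 0.5, 0.5),   # made-up entry
}

scores = {name: overall_similarity(*factors) for name, factors in repository.items()}

# The spec with the highest score is the candidate for transfer.
best = max(scores, key=scores.get)
```

Because the score is a product, a highly accurate spec from an unrelated domain (low class similarity) still ranks below a slightly less accurate spec whose classes and properties match well.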
Experimental Setup I
The goal of the evaluation is two-fold:
    Evaluate whether transfer learning can be used to build templates for link specs
    Discover whether the transferred templates can be used directly
113 specifications were retrieved from LATC; each has a manual evaluation of its links
[Figure: pie chart of the 113 specifications by domain (Persons, Events, Locations, Diseases, Drugs, Organizations, Misc); slice labels: 10%, 2%, 66%, 3%, 1%, 3%, 15%]
Experimental Setup II
Leave-one-out evaluation
    1.) Take the top-scored (most similar) specification and check whether it uses the same combination of similarity functions: assign 1 for match and 0 for no match
    2.) Compute the F-measure of learned link specs directly: works only on specs with both endpoints alive (only 12 out of 113)
Used URI similarity
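The first evaluation step can be sketched as a leave-one-out loop. Each specification is held out in turn, the most similar remaining specification is retrieved, and a hit is counted when both use the same combination of similarity functions. The function name, the tuple layout of `specs`, and the toy similarity `same_domain` are assumptions for illustration.

```python
def leave_one_out_accuracy(specs, similarity):
    """specs: list of (task_description, measure_combination) pairs."""
    hits = 0
    for i, (task, measures) in enumerate(specs):
        rest = specs[:i] + specs[i + 1:]
        # Retrieve the most similar remaining spec for the held-out task.
        _, best_measures = max(rest, key=lambda other: similarity(task, other[0]))
        hits += int(best_measures == measures)  # 1 for match, 0 for no match
    return hits / len(specs)


def same_domain(task_a, task_b):
    """Toy task similarity (an assumption): 1.0 for the same domain, else 0.0."""
    return 1.0 if task_a == task_b else 0.0


toy_specs = [
    ("drugs", frozenset({"trigrams"})),
    ("drugs", frozenset({"trigrams"})),
    ("cities", frozenset({"levenshtein"})),
]
accuracy = leave_one_out_accuracy(toy_specs, same_domain)
```

In the toy data the single "cities" spec has no counterpart in the rest of the repository, so it counts as a miss; domains with few specifications are harder for the same reason.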
Results of the First Set of Experiments
Right specification detected in 81% of all cases
    91% in the geo-spatial domain
    58% in the persons domain
[Figure: bar chart with a 0% to 100% axis; categories: Average, Persons, Events, Locations, Diseases, Drugs, Organizations, Misc]
Results of the Second Set of Experiments
In the second series of experiments, source and target endpoints need to be alive so that we can execute the transferred link spec (12 out of 113)
In general, low F-measures
[Figure: bar chart of Precision, Recall and F-Measure (0% to 100%) for the 12 specifications: dblp-datasemanticweb-researcher, euraxess-eures-country, rkbcrime-dbpedia-constabularies, dbpedia-datagovUK-city, eventseer-dblp_l3s-person, eventseer-dogfood-event, dbpedia-linkedgeodata-airport, stad-rmon-person, eventseer-dogfood-person, dbpedia-linkedgeodata-university, dbpedia-gutendata-texts, dbpedia-openei-country]
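The three measures reported for the second set of experiments follow the standard definitions and can be computed as below. The function name and the toy link sets are hypothetical; they are not data from the experiments.

```python
def precision_recall_f1(found, reference):
    """Precision, recall and F-measure of found links against reference links."""
    found, reference = set(found), set(reference)
    true_positives = len(found & reference)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: 3 links found, 2 of them correct, 4 links in the reference.
found_links = {("a", "x"), ("b", "y"), ("c", "z")}
reference_links = {("a", "x"), ("b", "y"), ("d", "w"), ("e", "v")}
precision, recall, f1 = precision_recall_f1(found_links, reference_links)
```

A transferred spec with unsuitable thresholds typically shows up as either low precision (threshold too loose) or low recall (threshold too strict), and the F-measure penalizes both.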
Summary
Conclusions:
    Detecting right template in 81% of all cases
    Transfer learning cannot replace the learning of thresholds in specifications
Future Work:
    Combination with machine-learning approaches for link specifications (e.g., EAGLE, COALA), in particular for learning thresholds
    More sophisticated class and property similarity approaches
The End
Jens Lehmann (Uni Leipzig)
Questions?
GeoKnow: http://geoknow.eu