Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation...
![Page 1: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/1.jpg)
Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario
Danica Damljanovic, Milan Stankovic, Philippe Laublet
![Page 2: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/2.jpg)
Innovation
![Page 3: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/3.jpg)
Innovation Platforms
Challenge: Promote innovation problems to an audience of solvers who can propose relevant innovative solutions
![Page 4: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/4.jpg)
Finding Meaningful Connections
Clay mining …
Kaolinite extraction from rocks …
Different communities use different terms and concepts to speak about semantically related things. Such “language” defines communities and separates them. Being able to find meaningful connections between concepts would enable us to build bridges between people and content.
http://bit.ly/hyProximity
![Page 5: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/5.jpg)
Concept recommendation
• Concepts you might not know but might want to use: to annotate your content, to search for content, to search for people…
• Help problem promoters discover relevant concepts (problem promoters are sometimes not field experts)
• Discovery = relevance + unexpectedness
http://bit.ly/hyProximity
![Page 6: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/6.jpg)
Discovering Direct and Lateral Concepts
• Structure-based similarity: HyProximity
• Statistical semantics similarity: Random Indexing, a well-known statistical semantics method from Information Retrieval, applied to RDF
![Page 7: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/7.jpg)
Linked Data-based Concept Recommendation
Zemanta Textual Input
DBPedia Concepts found in the text
DBPedia Exploration suggestions
http://bit.ly/hyProximity
![Page 8: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/8.jpg)
hyProximity
• We start from several seed concepts found directly in the text, and search the DBPedia graph
• The concepts found in the proximity of several seed concepts are considered more “in context” for the given input
• Concepts found at a shorter distance from the seed concepts have higher hyProximity
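The two intuitions above (closeness to several seeds, and shorter distance giving a higher score) can be sketched as follows. This is a minimal illustration, not the authors' hyProximity formula: the `1/distance` weighting, the `max_distance` cut-off, the function names and the toy graph are all assumptions made for the example.

```python
from collections import deque

def distances_from(graph, seed):
    """Breadth-first search: hop distance from seed to every reachable concept."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def proximity_scores(graph, seeds, max_distance=2):
    """Assumed scoring: sum 1/distance over every seed that reaches a candidate."""
    scores = {}
    for seed in seeds:
        for concept, d in distances_from(graph, seed).items():
            # Skip the seeds themselves and anything beyond the cut-off.
            if concept in seeds or d > max_distance:
                continue
            scores[concept] = scores.get(concept, 0.0) + 1.0 / d
    return scores

# Hypothetical undirected toy graph (adjacency lists), not real DBpedia data.
toy_graph = {
    "Paris": ["Seine", "Cities in France"],
    "Seine": ["Paris", "Rivers in France"],
    "Marne": ["Rivers in France"],
    "Rivers in France": ["Seine", "Marne", "Things in France"],
    "Cities in France": ["Paris", "Things in France"],
    "Things in France": ["Rivers in France", "Cities in France"],
}
scores = proximity_scores(toy_graph, {"Paris", "Seine"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy run, concepts reachable from both seeds accumulate two contributions, so they outrank a concept such as "Marne" that only one seed reaches within the cut-off.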
![Page 9: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/9.jpg)
Different Distance Functions
• Hierarchical: exploring skos:broader relations
• Transversal: exploring transversal links
• Mixed: a linear combination of hierarchical and transversal
[Figure: example graph with the concepts Paris, Seine, Marne, Chanel, BMW and Peugeot and the categories Rivers in France, Cities in France, Things in France, Products of France and Car Industry. Legend: skos:broader, other property. Values shown: 2, 2, 2, 2+1]
research.hypios.com/hyproximity
![Page 10: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/10.jpg)
Different Distance Functions
• Hierarchical: exploring skos:broader relations
• Transversal: exploring transversal links
• Mixed: a linear combination of hierarchical and transversal
[Figure: the same example graph (Paris, Seine, Marne, Chanel, BMW, Peugeot; Rivers in France, Cities in France, Things in France, Products of France, Car Industry) with transversal links labelled "flows through", "competitor" and "famous for", and the value “fashion”. Legend: skos:broader, other property. Values shown: 1, 1, 1]
research.hypios.com/hyproximity
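The slides describe the mixed function only as a linear combination of the hierarchical and transversal ones. A minimal sketch of such a combination is given below; the weight `alpha`, the function name and the example scores are assumptions for illustration and are not taken from the talk.

```python
def mixed_score(hierarchical, transversal, alpha=0.5):
    """Linear combination of two per-concept score dictionaries.

    alpha weights the hierarchical score and (1 - alpha) the transversal one;
    a concept missing from one dictionary contributes 0 for that part.
    """
    concepts = set(hierarchical) | set(transversal)
    return {
        c: alpha * hierarchical.get(c, 0.0) + (1.0 - alpha) * transversal.get(c, 0.0)
        for c in concepts
    }

# Hypothetical scores for three candidate concepts.
hier = {"Marne": 2.0, "Peugeot": 0.0}
trans = {"Chanel": 1.0, "Peugeot": 1.0}
mixed = mixed_score(hier, trans, alpha=0.5)
```

Setting `alpha` to 1 or 0 recovers the purely hierarchical or purely transversal ranking, which makes the two extremes easy to compare against the mixed one.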
![Page 11: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/11.jpg)
Random Indexing
• Words that appear in similar contexts (with the same set of other words) are contextually related, e.g. synonyms.
• Synonyms tend not to co-occur with one another directly, so indirect inference is required to draw associations between words used to express the same idea.
![Page 12: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/12.jpg)
Two steps to Random Indexing
• Indexing
  o For an RDF graph, generate virtual documents
  o Prepare the corpus (pre-processing)
  o Generate the semantic index
• Search: given a term X, calculate the cosine similarity between the vector of that term and the other vectors in the semantic space
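The search step can be illustrated with a small cosine-similarity ranking. This is a sketch under stated assumptions: the three-dimensional vectors and the terms in `space` are made up for the example, whereas a real semantic index would hold much longer vectors produced by the indexing step.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar(term, space, top_k=3):
    """Rank every other term in the space by cosine similarity to `term`."""
    query = space[term]
    scored = [(other, cosine(query, vec)) for other, vec in space.items() if other != term]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical semantic space: term -> context vector.
space = {
    "clay": [1.0, 2.0, 0.0],
    "kaolinite": [2.0, 4.0, 0.0],
    "fashion": [0.0, 0.0, 3.0],
}
neighbours = most_similar("clay", space)
```

Because cosine similarity ignores vector length, "kaolinite" (a scaled copy of "clay" in this toy space) ranks first, while the orthogonal "fashion" vector scores zero.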
![Page 13: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/13.jpg)
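Building context vectors

$$
\underbrace{\begin{pmatrix}
1 & 2 & \cdots & 0 \\
3 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 1 & \cdots & 10
\end{pmatrix}}_{M:\ \text{rows } t_1 \ldots t_q,\ \text{columns } d_1 \ldots d_p}
\times
\underbrace{\begin{pmatrix}
0 & 0 & -1 & 1 & -1 & 1 \\
-1 & 1 & 0 & 0 & 1 & -1 \\
\vdots & & & & & \vdots \\
0 & 1 & 0 & -1 & -1 & 1
\end{pmatrix}}_{D:\ \text{rows } d_1 \ldots d_p}
=
\underbrace{\begin{pmatrix}
t_1 \\ t_2 \\ \vdots \\ t_q
\end{pmatrix}}_{T}
$$

Each row of D is the random index vector of one document, characterised by its dimensionality (= n) and its seed length.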
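A small sketch of how term context vectors can be accumulated from document index vectors in Random Indexing: each document gets a sparse random vector of +1/-1/0 entries, and a term's context vector is the frequency-weighted sum of the index vectors of the documents it occurs in. Treating "seed length" as the number of non-zero entries, the default sizes and the toy counts are assumptions made for this example.

```python
import random

def random_index_vector(dimension, seed_length, rng):
    """Sparse ternary vector: seed_length non-zero entries, alternating +1 and -1."""
    vector = [0] * dimension
    positions = rng.sample(range(dimension), seed_length)
    for i, pos in enumerate(positions):
        vector[pos] = 1 if i % 2 == 0 else -1
    return vector

def build_context_vectors(term_doc_counts, dimension=6, seed_length=4, seed=0):
    """Return (document index vectors, term context vectors)."""
    rng = random.Random(seed)
    documents = sorted({doc for counts in term_doc_counts.values() for doc in counts})
    index = {doc: random_index_vector(dimension, seed_length, rng) for doc in documents}
    context = {}
    for term, counts in term_doc_counts.items():
        vec = [0] * dimension
        for doc, freq in counts.items():
            # Add the document's index vector once per occurrence of the term.
            for i, value in enumerate(index[doc]):
                vec[i] += freq * value
        context[term] = vec
    return index, context

# Hypothetical term-by-document counts: t1 occurs once in d1 and twice in d2, etc.
counts = {"t1": {"d1": 1, "d2": 2}, "t2": {"d1": 3}}
index, context = build_context_vectors(counts)
```

The resulting vectors stay at the fixed dimensionality n regardless of how many documents are indexed, which is the main practical appeal of the technique.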
![Page 14: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/14.jpg)
Indexing: virtual documents
[Figure: the representative subgraph for URI=S (resources S, O1, O2; properties P1 to P10; literals L1 to L8) is lexicalised into the virtual document for URI=S, which lists the statements of the subgraph, e.g. "S P4 O1" and "S P7 O2"]
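One way to picture the lexicalisation step is to write out, as text, every statement in which the URI takes part. The sketch below makes several assumptions that the slide does not confirm: the subgraph is limited to triples that directly mention the URI, labels are looked up in a plain dictionary, and all names and labels are invented for the example.

```python
def virtual_document(uri, triples, labels):
    """Concatenate the labels of every triple in which `uri` is subject or object."""
    def lexicalise(node):
        # Fall back to the raw identifier when no label is known.
        return labels.get(node, node)

    lines = []
    for s, p, o in triples:
        if s == uri or o == uri:
            lines.append(" ".join(lexicalise(x) for x in (s, p, o)))
    return "\n".join(lines)

# Hypothetical triples and labels, reusing the S / P / O naming of the figure.
triples = [
    ("S", "P4", "O1"),
    ("S", "P7", "O2"),
    ("O1", "P5", "X"),  # does not mention S, so it is left out
]
labels = {
    "S": "kaolinite",
    "P4": "industry",
    "O1": "clay mining",
    "P7": "product",
    "O2": "ceramics",
}
doc = virtual_document("S", triples, labels)
```

Each URI then has one text document, so a standard text-indexing pipeline can be applied to the RDF graph.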
![Page 15: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/15.jpg)
Experiments
• 26 real innovation problems from Hypios
• Measure of success: the suggested concepts appear in the actual solutions (precision, recall, F-measure)
  (+) reasonable list of concepts from real scenarios
  (-) not complete:
    o User study: measure discovery = relevance + unexpectedness
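For reference, precision, recall and F-measure over a list of suggested concepts can be computed as in the sketch below. The balanced F1 variant is assumed here (the slide only says "f-measure"), and the concept identifiers are placeholders.

```python
def precision_recall_f1(suggested, gold):
    """Compare suggested concepts with the concepts found in the actual solutions."""
    suggested, gold = set(suggested), set(gold)
    hits = len(suggested & gold)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

# Placeholder concept identifiers: 4 suggestions, 3 gold concepts, 2 in common.
p, r, f = precision_recall_f1(["a", "b", "c", "d"], ["b", "d", "e"])
```

With two of four suggestions correct and two of three gold concepts found, precision is 0.5, recall is 2/3 and F1 is 4/7.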
![Page 16: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/16.jpg)
DBpedia Dataset
• Select a number of properties relevant to the Open Innovation-related scenario
• dbo:product, dbp:products, dbo:industry, dbo:service, dbo:genre, and properties serving to establish a hierarchical categorization of concepts, namely dc:subject and skos:broader
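Restricting the graph to the selected properties amounts to a simple whitelist filter over triples. The sketch below uses prefixed names as plain strings; treating the non-hierarchical properties as the "transversal" ones is an assumption, and the example triples are invented rather than taken from DBpedia.

```python
# Properties named on the slide, written as prefixed names.
HIERARCHICAL = {"dc:subject", "skos:broader"}
TRANSVERSAL = {"dbo:product", "dbp:products", "dbo:industry", "dbo:service", "dbo:genre"}
SELECTED_PROPERTIES = HIERARCHICAL | TRANSVERSAL

def select_triples(triples):
    """Keep only triples whose predicate is one of the selected properties."""
    return [t for t in triples if t[1] in SELECTED_PROPERTIES]

def split_by_kind(triples):
    """Separate hierarchical links from the remaining (assumed transversal) ones."""
    hierarchical = [t for t in triples if t[1] in HIERARCHICAL]
    transversal = [t for t in triples if t[1] in TRANSVERSAL]
    return hierarchical, transversal

# Hypothetical triples for illustration only.
triples = [
    ("ex:Peugeot", "dbo:industry", "ex:Car_Industry"),
    ("ex:Peugeot", "dbo:abstract", "some text"),
    ("ex:Seine", "dc:subject", "ex:Rivers_in_France"),
]
kept = select_triples(triples)
hier, trans = split_by_kind(kept)
```

Filtering before graph exploration or indexing keeps unrelated properties out of both methods' input.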
![Page 17: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/17.jpg)
Evaluation
• “Gold standard”
  o Extract problem URIs
  o Extract solution URIs
• Baseline:
  o Google AdWords Keyword Tool: finds similar topics based on their distribution in textual corpora and in corpora of search queries.
  o Suggests up to 600 concepts, which are then used for Web crawling to find experts.
![Page 18: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/18.jpg)
Evaluation: Results
![Page 19: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/19.jpg)
User Study
• Suggestions that are both relevant and unexpected
  o the most valuable discoveries for the user
• 12 users
• 34 problem evaluations
  o 3060 suggested concepts/keywords (34 evaluations × 3 methods × 30 suggestions)
• For the chosen innovation problem, the evaluators were presented with the lists of the 30 top-ranked suggestions generated by AdWords, hyProximity (mixed approach) and Random Indexing.
![Page 20: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/20.jpg)
Example
![Page 21: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/21.jpg)
User Study: Results
![Page 22: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/22.jpg)
Conclusion
• Linked Data is a valuable source of knowledge for concept recommendation
• Our two methods are complementary
  o hyProximity is better for precision
  o Random Indexing is better for recall
• User study: unexpectedness is higher with our methods than with the baseline
• Subjective user comments:
  o Random Indexing: generic
  o hyProximity: granular
  o AdWords: redundant
![Page 23: Linked Data-based Concept Recommendation: Comparison of Different Methods in Open Innovation Scenario](https://reader033.fdocuments.us/reader033/viewer/2022060118/558bd4d6d8b42abc158b456f/html5/thumbnails/23.jpg)
Thank You!
• Find out more: http://research.hypios.com/?page_id=165
Contact us:
• Danica Damljanovic: @dancheeee
• Milan Stankovic: @milstan