Tutorial Cognition - Irene
Cognition for the Semantic Web
Irene Celino + Tomi Kauppinen
SSSW 2016 - Bertinoro, 22nd July 2016
Cognition… what? Cognitive Computing refers to systems that learn at scale and interact naturally with people to extend what humans or machines can do alone
(Eleni Pratsini – IBM Research, ESWC 2016 Keynote)
Cognitive Systems studies aim at (1) gaining a deeper understanding of how humans realize their unique intelligent potential and (2) developing artificial cognitive agents to assist humans in performing demanding tasks in order to strengthen their autonomy
(Cognitive Systems Group, University of Bremen)
This morning's tutorial: Cognition for the Semantic Web
Irene will talk about "Involving Humans in Semantic Data Management":
- methods, techniques and tools to include humans in a Semantic Web pipeline
- leveraging humans to develop better Semantic Web systems
Tomi's talk is titled "Let us build cognitive & semantic systems to support understanding":
- visual examples of why and how to make sense of data and to understand underlying phenomena
- methods, evaluation, discoveries
Cognition for the Semantic Web: Involving Humans in Semantic Data Management
Irene Celino ([email protected]) CEFRIEL – Milano, Italy
Agenda
- Methods to involve people (a.k.a. crowdsourcing and its brothers)
- Motivation and incentives (a.k.a. let's have fun with games)
- Crowdsourcing and the Semantic Web (a.k.a. this is SSSW after all…)
Methods to involve people
What goals can humans help machines to achieve? Which scientific communities "exploit" people? How can we involve a crowd of people?
- Citizen Science
- Crowdsourcing
- Human Computation
Wisdom of crowds [1]
“Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations”
Criteria for a wise crowd:
- Diversity of opinion (importance of interpretation)
- Independence (not a "single mind")
- Decentralization (importance of local knowledge)
- Aggregation (aim to get a collective decision)
There are also failures/risks in crowd decisions: homogeneity, centralization, division, imitation, emotionality
Citizen Science [2]
Problem: a scientific experiment requires the execution of a lot of simple tasks, but researchers are busy.
Solution: engage the general audience in solving those tasks, explaining that they are contributing to science, research and the public good.
Example: https://www.zooniverse.org/
Crowdsourcing [3]
Problem: a company needs to execute a lot of simple tasks, but cannot afford hiring a person to do that job.
Solution: pack tasks into bunches (human intelligence tasks, or HITs) and outsource them to a very cheap workforce through an online platform.
Example: https://www.mturk.com/
Human Computation [4]
Problem: an Artificial Intelligence algorithm is unable to achieve an adequate result with a satisfactory level of confidence.
Solution: ask people to intervene when the AI system fails, "masking" the task within another human process.
Example: https://www.google.com/recaptcha/
Spot the difference…
Similarities:
- Involvement of people
- No automatic replacement
Variations:
- Motivation
- Reward (glory, money, passion)
Hybrid or parallel combinations are possible!
[Diagram: Citizen Science, Crowdsourcing and Human Computation as overlapping approaches]
Motivation and incentives
Apart from extrinsic rewards (money, prizes, etc.), what intrinsic incentives can we adopt to motivate people? How can we leverage "fun" through games and game-like applications?
- Gamification
- Games with a Purpose (GWAP)
Gamification [5,6]
Problem: motivate people to execute boring or mandatory tasks (especially in cooperative environments) that they are usually not very happy to do.
Solution: introduce typical game elements (e.g. points, badges, leaderboards) within more traditional processes and systems.
Example: https://badgeville.com/
Games with a Purpose (GWAP) [7,8]
Problem: the same as Human Computation (ask humans when AI fails).
Solution: hide the task within a game, so that users are motivated by game challenges, often remaining unaware of the hidden purpose; the task solution comes from agreement between players.
Example: http://www.gwap.com/
Crowdsourcing and the Semantic Web
Can we involve people in Semantic Web systems? What semantic data management tasks can we effectively “outsource” to humans?
Why Crowdsourcing in the Semantic Web?
Knowledge-intensive and/or context-specific character of Semantic Web tasks: e.g., conceptual modelling, multi-language resource labelling, content annotation with ontologies, concept/entity similarity recognition, …
Crowdsourcing can help to engage users and involve them in executing tasks: e.g., wikis for semantic content authoring, folksonomies to bootstrap formal ontologies, human computation approaches, …
Semantic Crowdsourcing [9]
Many tasks in Semantic Web data management/curation can exploit Crowdsourcing
[Diagram: crowdsourceable tasks arranged by level (fact level vs. schema level) and activity (collection, creation, correction/validation, filtering, ranking, linking): conceptual modelling, ontology population, quality assessment, ontology re-engineering, ontology pruning, ontology elicitation, knowledge acquisition, ontology repair, knowledge base update, data search/selection, link generation, ontology alignment, ontology matching]
Focus on Data Linking
Creation of links in the form of RDF triples (subject, predicate, object):
- within the same dataset (i.e. generating new connections between resources of the same dataset)
- across different datasets (i.e. creating RDF links, as they are named in the Linked Data world)
Notes:
- Generated links can have an associated score expressing a sort of "confidence" in the truth value or in the "relevance" of the triple (sketched below)
- In the literature, data linking often means finding equivalent resources (similarly to record linkage in database research), i.e. triples with a correspondence/match predicate (e.g. owl:sameAs); in the following, data linking is intended in its broader meaning (i.e. links with any predicate)
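To make the triple-plus-score idea concrete, here is a minimal Python sketch using rdflib; the example namespace, the resource identifiers and the use of RDF reification to attach the confidence score are illustrative assumptions, not the tutorial's prescribed modelling.

```python
# Minimal sketch: a crowdsourced link plus a confidence score.
# The ex: namespace, identifiers and the reification-based score
# attachment are hypothetical choices for illustration.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import FOAF, RDF, XSD

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()

asset = EX["asset/duomo-di-milano"]  # a cultural heritage asset
photo = EX["photo/12345"]            # a candidate depiction

# The generated link: <asset> foaf:depiction <photo>
g.add((asset, FOAF.depiction, photo))

# Attach a confidence score to the link via RDF reification
stmt = EX["link/duomo-depiction-12345"]
g.add((stmt, RDF.type, RDF.Statement))
g.add((stmt, RDF.subject, asset))
g.add((stmt, RDF.predicate, FOAF.depiction))
g.add((stmt, RDF.object, photo))
g.add((stmt, EX.confidence, Literal(0.87, datatype=XSD.decimal)))

print(g.serialize(format="turtle"))
```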
Knowledge acquisition
Subject: {Milano cultural heritage assets}
Predicate: foaf:depiction
Object: ? {photos}
Goal: create <asset> foaf:depiction <photo>
Citizen science campaign: http://bit.ly/picmiapp
Level badges as intrinsic incentives (gamification)
Contributions with attribution/recognition
Link ranking [10]
Set of all links: <asset> foaf:depiction <photo>
Goal: assign a score to rank links on their "recognisability"/"representativeness"
Pure GWAP with hidden purpose: http://bit.ly/indomilando
Points, badges, leaderboard as intrinsic reward
The score is a function of k/n, where k is the no. of successes (= recognitions) and n the no. of trials of the Bernoulli process (guess or not guess) realized by the game (see the sketch below)
Link ranking is a result of the "agreement" between players
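As a rough illustration of that scoring idea, the sketch below estimates a link's rank score from its Bernoulli trials; the smoothed success rate is an assumption, since the exact function used in [10] is not reproduced here.

```python
# Sketch of a ranking score from k successes out of n Bernoulli trials.
# Assumption: the score is a smoothed success rate; the exact function
# in [10] may differ.
def ranking_score(successes: int, trials: int) -> float:
    """Estimate a link's recognisability from its game outcomes."""
    if trials == 0:
        return 0.5  # no evidence yet: neutral score
    # Laplace smoothing avoids extreme scores for rarely played links
    return (successes + 1) / (trials + 2)

# Rank <asset> foaf:depiction <photo> links by score (toy data)
links = {"photo-1": (9, 10), "photo-2": (3, 10), "photo-3": (1, 1)}
ranked = sorted(links, key=lambda p: ranking_score(*links[p]), reverse=True)
print(ranked)  # ['photo-1', 'photo-3', 'photo-2']
```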
Link validation [11]
Set of links: <land-area> clc:hasLandCover <land-cover>
Automatic classifications disagree: <land-cover-assigned-by-DUSAF> ≠ <land-cover-assigned-by-GL30>
Goal: assign a score to each link to discover the "right" land cover class
Pure GWAP with a not-so-hidden purpose (played by "experts"): http://bit.ly/foss4game
https://youtu.be/Q0ru1hhDM9Q
Points, badges, leaderboard as intrinsic reward
A player scores if he/she guesses one of the two disagreeing classifications
The score of each link is updated on the basis of players' choices (incremented if the link is selected, decremented if not); when the score of a link exceeds a given threshold, the link is considered "true" (see the sketch below)
Link validation is a result of the "agreement" between players
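A minimal sketch of that update rule; the unit step sizes and the threshold value are placeholders, not the parameters used in [11].

```python
# Sketch of the link-validation score update (placeholder step sizes
# and threshold; the actual parameters of [11] are not reproduced here).
THRESHOLD = 5.0

def update_score(score: float, selected: bool) -> float:
    """Increment the score if players selected the link, else decrement."""
    return score + 1.0 if selected else score - 1.0

def is_validated(score: float) -> bool:
    """A link is considered 'true' once its score exceeds the threshold."""
    return score > THRESHOLD

score = 0.0
for choice in [True, True, False, True, True, True, True]:  # players' picks
    score = update_score(score, choice)
print(score, is_validated(score))  # 5.0 False: still needs more agreement
```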
Caveat 1: Mice and Men (or: keep it simple)
Crowdsourcing workers behave like mice [12]:
- Mice prefer to use their motor skills (biologically cheap, e.g. pressing a lever to get food) rather than their cognitive skills (biologically expensive, e.g. going through a labyrinth to get food)
- Workers prefer/are better at simple tasks (e.g. those that can be solved at first sight) and discard/are worse at more complex tasks (e.g. those that require logic)
Crowdsourcing tasks should therefore be carefully designed:
- Keep tasks as simple as possible for the workers to solve
- Pair complex tasks with other incentives (e.g. variety/novelty)
Suggestion 1: Divide et impera (or: Find-Fix-Verify)
Find-Fix-Verify crowd programming pattern [13]
A long and "expensive" task…
- Summarize a text to shorten its total length
…is decomposed into more atomic tasks…
1. find sentences that need to be shortened
2. fix a sentence by shortening it
3. verify which summarized sentence maintains the original meaning
…and the complex task is turned into a workflow of simple tasks, with each step outsourced to a crowd (see the sketch below)
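A skeleton of the pattern under stated assumptions: `post_hit` is a hypothetical stand-in for outsourcing one atomic task to a crowdsourcing platform and collecting the answers.

```python
from typing import Callable, List

# Skeleton of Find-Fix-Verify [13]. post_hit(kind, payload) is a
# hypothetical function that posts one atomic task to a crowd and
# returns the collected answers as a list.
def find_fix_verify(sentences: List[str],
                    post_hit: Callable[[str, dict], list]) -> List[str]:
    # 1. FIND: workers flag the sentences that need shortening
    flags = post_hit("find", {"sentences": sentences})
    out = []
    for sentence, needs_fix in zip(sentences, flags):
        if not needs_fix:
            out.append(sentence)  # crowd judged it fine as-is
            continue
        # 2. FIX: several workers independently propose shorter rewrites
        candidates = post_hit("fix", {"sentence": sentence})
        # 3. VERIFY: a different crowd votes for the rewrite that best
        # preserves the original meaning
        votes = post_hit("verify", {"original": sentence,
                                    "candidates": candidates})
        out.append(max(candidates, key=votes.count))
    return out
```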
Caveat 2: aggregation and disagreement
Are all contributors "created equal"?
- Contributions/results on the same task are usually aggregated across different workers ("wisdom of crowds", "collective intelligence")
- The score update formula should weight contributions differently, by including some evaluation of contributors' reliability (e.g. against a gold standard); a sketch follows below
Is there always a "right answer"? Or is there a "crowd truth"? [14]
- Not always true/false, because of human subjectivity, ambiguity and uncertainty
- Disagreement across contributors is not necessarily bad, but a sign of different opinions, interpretations, contexts, perspectives, …
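A sketch of reliability-weighted aggregation; deriving each worker's weight from their accuracy on gold-standard questions is one common option, assumed here for illustration.

```python
from collections import defaultdict

# Sketch of reliability-weighted answer aggregation. Assumption: each
# worker's weight is their accuracy on gold-standard questions.
def aggregate(answers: dict, reliability: dict) -> str:
    """answers: worker -> answer; reliability: worker -> weight in [0, 1]."""
    weight = defaultdict(float)
    for worker, answer in answers.items():
        weight[answer] += reliability.get(worker, 0.5)  # 0.5: unknown worker
    return max(weight, key=weight.get)

answers = {"w1": "cat", "w2": "dog", "w3": "dog"}
reliability = {"w1": 0.95, "w2": 0.55, "w3": 0.50}
print(aggregate(answers, reliability))  # 'dog' (1.05) outweighs 'cat' (0.95)
```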
Suggestion 2: compare and contrast
From "wisdom of crowds" to "wisdom of the crowdsourcing methods": different approaches to solving the same problem can be run in parallel to compare their results
- Which is the best crowdsourcing approach for a specific use case?
- Which is the most suitable crowd?
- Is crowdsourcing better/faster/cheaper than automatic means (e.g. AI)?
Examples:
- Detecting quality issues in DBpedia [15]: find-verify strategy with different crowds (experts and workers) behind the find step
- Employing people for ontology alignment [16]: outcomes comparable to the results of OAEI systems
[Diagram: the same input task routed in parallel to Citizen Science, Human Computation, Crowdsourcing and automatic/machine computation, whose output solutions can then be compared]
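As a toy illustration of such a comparison, the sketch below scores each channel's answers against a small expert-labelled sample; all names and numbers are invented.

```python
# Toy sketch of "compare and contrast": run several channels in parallel
# on the same tasks and score each against an expert-labelled sample.
# All data below is invented for illustration.
gold = {"t1": "yes", "t2": "no", "t3": "yes"}
outputs = {
    "citizen_science":    {"t1": "yes", "t2": "no",  "t3": "no"},
    "paid_crowd":         {"t1": "yes", "t2": "yes", "t3": "yes"},
    "automatic_baseline": {"t1": "no",  "t2": "no",  "t3": "yes"},
}

def accuracy(answers: dict, reference: dict) -> float:
    """Fraction of tasks on which a channel matches the expert labels."""
    return sum(answers[t] == reference[t] for t in reference) / len(reference)

for method, answers in outputs.items():
    print(f"{method}: {accuracy(answers, gold):.2f}")
```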
Bibliography (1/2)
[1] James Surowiecki. The Wisdom of Crowds. Anchor, 2005.
[2] Alan Irwin. Citizen Science: A Study of People, Expertise and Sustainable Development. Psychology Press, 1995.
[3] Jeff Howe. Crowdsourcing: How the Power of the Crowd is Driving the Future of Business. Random House, 2008.
[4] Edith Law and Luis von Ahn. Human Computation. Synthesis Lectures on Artificial Intelligence and Machine Learning, 5(3):1–121, 2011.
[5] Jane McGonigal. Reality is Broken: Why Games Make Us Better and How They Can Change the World. Penguin, 2011.
[6] Kevin Werbach and Dan Hunter. For the Win: How Game Thinking Can Revolutionize Your Business. Wharton Digital Press, 2012.
[7] Luis von Ahn. Games with a Purpose. Computer, 39(6):92–94, 2006.
[8] Luis von Ahn and Laura Dabbish. Designing Games with a Purpose. Communications of the ACM, 51(8):58–67, 2008.
Bibliography (2/2)
[9] Cristina Sarasua, Elena Simperl, Natasha Noy, Abraham Bernstein, Jan Marco Leimeister. Crowdsourcing and the Semantic Web: A Research Manifesto, Human Computation 2 (1), 3-17, 2015.
[10] Irene Celino, Andrea Fiano, and Riccardo Fino. Analysis of a Cultural Heritage Game with a Purpose with an Educational Incentive, ICWE 2016 Proceedings, pp. 422-430, 2016.
[11] Maria Antonia Brovelli, Irene Celino, Monia Molinari, Vijay Charan Venkatachalam. A crowdsourcing-based game for land cover validation, ESA Living Planet Symposium 2016 Proceedings, 2016.
[12] Panos Ipeirotis. On Mice and Men: The Role of Biology in Crowdsourcing, Keynote talk at Collective Intelligence, 2012.
[13] M. Bernstein, G. Little, R. Miller, B. Hartmann, M. Ackerman, D. Karger, D. Crowell, K. Panovich. Soylent: A Word Processor with a Crowd Inside, UIST Proceedings, 2010.
[14] Lora Aroyo, Chris Welty. Truth is a Lie: 7 Myths about Human Annotation, AI Magazine 2014.
[15] Maribel Acosta, Amrapali Zaveri, Elena Simperl, Dimitris Kontokostas, Fabian Flöck, Jens Lehmann. Detecting Linked Data Quality Issues via Crowdsourcing: A DBpedia Study, Semantic Web Journal, 2016.
[16] Cristina Sarasua, Elena Simperl, Natasha Noy. Crowdmap: Crowdsourcing ontology alignment with microtasks, ISWC 2012 Proceedings, pp. 525-541, 2012.
Other relevant work (1/2)
Cultural heritage annotation:
- C. Wieser, F. Bry, A. Bérard and R. Lagrange. ARTigo: Building an Artwork Search Engine with Games and Higher-Order Latent Semantic Analysis. In First AAAI Conference on Human Computation and Crowdsourcing, 2013.
Ranking triples for relevance:
- J. Hees, T. Roth-Berghofer, R. Biedert, B. Adrian and A. Dengel. BetterRelations: Using a Game to Rate Linked Data Triples. In Annual Conference on Artificial Intelligence, 2011.
- I. Celino, E. Della Valle and R. Gualandris. On the Effectiveness of a Mobile Puzzle Game UI to Crowdsource Linked Data Management Tasks. In CrowdUI workshop, 2014.
DBpedia data cleansing:
- J. Waitelonis, N. Ludwig, M. Knuth and H. Sack. WhoKnows? Evaluating Linked Data Heuristics with a Quiz that Cleans Up DBpedia. Interactive Technology and Smart Education, 8(4), 2011.
Urban games on spatial knowledge:
- I. Celino, S. Contessa, M. Corubolo, D. Dell'Aglio, E. Della Valle, S. Fumeo and T. Krüger. Linking Smart Cities Datasets with Human Computation – the Case of UrbanMatch. In ISWC 2012.
- I. Celino, D. Cerizza, S. Contessa, M. Corubolo, D. Dell'Aglio, E. Della Valle and S. Fumeo. Urbanopoly – a Social and Location-based Game with a Purpose to Crowdsource your Urban Data. In SoHuman 2012.
Other relevant work (2/2)
Crowdsourcing ontology engineering tasks or SPARQL querying:
- N. Noy, J. Mortensen, M. Musen and P. Alexander. Mechanical Turk as an Ontology Engineer? Using Microtasks as a Component of an Ontology-Engineering Workflow. In WebSci 2013.
- G. Wohlgenannt, M. Sabou and F. Hanika. Crowd-based Ontology Engineering with the uComp Protégé Plugin. Semantic Web, 7(4), 2016.
- M. Acosta, E. Simperl, F. Flöck and M.E. Vidal. HARE: A Hybrid SPARQL Engine to Enhance Query Answers via Crowdsourcing. In K-CAP 2015.
Crowdsourced contributions and crowd quality:
- R. Kern, H. Thies, C. Zirpins and G. Satzger. Dynamic and Goal-Based Quality Management for Human-Based Electronic Services. Int. J. Cooperative Inf. Syst., 21(1), 2012.
- T. Schulze, S. Krug and M. Schader. Workers' Task Choice in Crowdsourcing and Human Computation Markets. In ICIS 2012.
- D. Difallah, G. Demartini and P. Cudré-Mauroux. Pick-a-Crowd: Tell Me What You Like, and I'll Tell You What to Do. In WWW 2013.
- M. Feldman and A. Bernstein. Cognition-based Task Routing: Towards Highly-Effective Task-Assignments in Crowdsourcing Settings. In ICIS 2014.
Mixed crowdsourcing approaches:
- M. S. Bernstein. Crowd-Powered Systems. KI, 27(1):69–73, 2013.
Cognition for the Semantic Web: Research Questions
Irene Celino + Tomi Kauppinen
Warning on group forming
Today is the last day to form groups to investigate the research questions.
We suggest you take the chance to work with people you haven't grouped with yet… diversity is a resource!
Mini-project groups are not allowed!
Before getting too serious, let's seriously play…
You will be asked to present your research task results in 3 minutes (which is tough!)
We want to give one of the groups the chance to win 2 additional minutes… To whom? To the group of the winner of this morning tutorial's quiz game!!
1. Grab your mobile phone (or tablet or laptop)
2. Make sure you are connected to the internet
3. And finally go to: KAHOOT.IT
Research Question
Why and how to make sense of linked spatio-temporal data?
- Can we turn machines into cognitive systems?
- What is the role of human cognitive capabilities?
- How can Linked Data / Semantic Web technologies help?
Possible approaches: crowdsourcing, information visualization, storytelling, gamification, sensor data fusion, affordances, …
Possible starting point paper: K. Janowicz. The Role of Space and Time for Knowledge Organization on the Semantic Web. Semantic Web, 1(1-2):25–32, 2010.
What you have to do
Task: devise your own approach
- Provide a convincing future scenario (i.e. design fiction) as an example
- Use your creativity and freedom!
- Come up with a proposal including motivation (why), approach (how) and expected results (what); bonus points if you add ideas about evaluation
Put your group results in the shared Etherpad (no PowerPoint!): http://public.etherpad-mozilla.org/p/cognition-day